
Expression string

Posted: 05 May 2017, 14:32
by alexdown
Hi guys, congratulations to everyone on the work done. I have a question: I'm adding a new source to the categories section, similar to "oggi in tv", except that I'd like to pull the content from netflixlovers.it/novita-su-netflix/.
I tried duplicating filmontv.py and modified channelselector.py. So far so good: the new entry shows up in the section. But when it comes to retrieving the information, I can't work out the string (the regular expression) needed to read the HTML.
Below is the portion of HTML containing the information I'm interested in.
Could someone help me write the string?
As soon as I'm done I'll post the file and share it.

Thanks

Code:

 <li class="col-md-4 col-sm-6 col-xs-12 isotope-item Kids">
        <div class="portfolio-item genre" title="Mune il guardiano della luna (Mune: Guardian of the Moon)">
            <span class="hide">Mune il guardiano della luna (Mune: Guardian of the Moon)</span>
            <a href="/catalogo-netflix-italia/schede/80125818/mune-il-guardiano-della-luna">
                <div class="thumb-info thumb-info-centered-info">
                  <div class="thumb-info-wrapper">
                                                            <img src="https://occ-0-784-778.1.nflxso.net/art/24c21/c15b2d9c1d85943a3dd466ca3e3964da02b24c21.jpg" class="img-responsive" alt="" width="340" height="192">
                    <div class="top10-rating-container">
                      <div class="top10rate" title="Punteggio Netflix Lovers: 3.29">3,2</div>
                      <div class="top10stars text-color-primary">
                        <div class="ratingType">Rating Netflix Lovers</div>
                        <div class='rating_bar_small'>
                          <div class='rating_small' style='width: 65.70%;'>
                          </div>
                        </div>
                      </div>
                    </div>
                    <span class="thumb-info-type">Kids</span>

                    <div class="thumb-info-action">
                      <span class="thumb-info-desc">
                                <span class="thumb-info-caption-text">Quando il vecchio custode della luna va in pensione, viene sostituito dal giovane Mune che, per&#242;, teme di non essere all&#39;altezza del compito.</span>
                                <span class="thumb-info-netflix">Vai alla Scheda</span>
                            </span>
                      <span class="thumb-info-action-icon"><i class="fa fa-file-text"></i></span>
                    </div>

                  </div>
                 
                    <div class="thumb-bottom-title" title="Mune il guardiano della luna (Mune: Guardian of the Moon)">Mune il guardiano della luna</div>
                </div>
            </a>
        </div>
    </li>

I finally managed to write a correct expression using the tool at https://regex101.com/#python

Code:

<li class="col-md-4 col-sm-6 col-xs-12 isotope-item Film">[^>]+>[^>]+<span class="hide">(.*?)</span>[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+>+[^>]+<img src="(.*?)"[^<]+>
but it still doesn't work: it keeps loading forever.
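
A possible explanation, offered as a guess rather than a diagnosis: the pattern above hard-codes `isotope-item Film`, while the sample item is tagged `Kids`, and each adjacent `[^>]+[^>]+` pair can split the same text in many ways, which may make a failed match slow. The sketch below is a more tolerant variant checked only against a trimmed copy of the HTML posted above. `SAMPLE` and `PATRON` are names made up for this example, and the live page may be structured differently.

```python
import re

# Trimmed copy of the item posted above (assumption: other items look alike).
SAMPLE = '''<li class="col-md-4 col-sm-6 col-xs-12 isotope-item Kids">
    <div class="portfolio-item genre" title="Mune il guardiano della luna (Mune: Guardian of the Moon)">
        <span class="hide">Mune il guardiano della luna (Mune: Guardian of the Moon)</span>
        <a href="/catalogo-netflix-italia/schede/80125818/mune-il-guardiano-della-luna">
            <div class="thumb-info-wrapper">
                <img src="https://occ-0-784-778.1.nflxso.net/art/24c21/c15b2d9c1d85943a3dd466ca3e3964da02b24c21.jpg" class="img-responsive" alt="" width="340" height="192">
            </div>
        </a>
    </div>
</li>'''

# Anchor on stable attributes and skip everything in between with a lazy .*?
PATRON = (r'<li class="[^"]*isotope-item [^"]*">'    # any category, not only Film
          r'.*?<span class="hide">([^<]+)</span>'     # group 1: title
          r'.*?<a href="([^"]+)"'                     # group 2: relative URL
          r'.*?<img src="([^"]+)"')                   # group 3: thumbnail

# re.DOTALL lets .*? cross line breaks, as in the channel code.
matches = re.compile(PATRON, re.DOTALL).findall(SAMPLE)

for title, url, thumbnail in matches:
    print(title)
    print(url)
    print(thumbnail)
```

One caveat with this approach: if an item had no `<img>` tag, the lazy `.*?` would run on into the next item, so the results are worth checking against the real page.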

Here is the channel code:

Code:

# -*- coding: utf-8 -*-
# ------------------------------------------------------------
# streamondemand - XBMC Plugin
# Channel for filmontv
# http://www.mimediacenter.info/foro/viewforum.php?f=36
# ------------------------------------------------------------

import re
import urllib

from core import config
from core import logger
from core import scrapertools
from core.item import Item
from core.tmdb import infoSod

__channel__ = "net"
__category__ = "N"
__type__ = "generic"
__title__ = "filmontv.tv (IT)"
__language__ = "IT"

DEBUG = config.get_setting("debug")

host = "http://www.netflixlovers.it"

TIMEOUT_TOTAL = 60


def isGeneric():
    return True


def mainlist(item):
    logger.info("streamondemand.filmontv mainlist")
    itemlist = [Item(channel=__channel__,
                     title="[COLOR red]Serie TV - Azione e Avventura[/COLOR]",
                     action="tvoggi",
                     url="%s/catalogo-netflix-italia/tutti-i-film-cult-su-netflix-italia/" % host,
                     thumbnail="http://a2.mzstatic.com/eu/r30/Purple/v4/3d/63/6b/3d636b8d-0001-dc5c-a0b0-42bdf738b1b4/icon_256.png")]

    return itemlist


def tvoggi(item):
    logger.info("streamondemand.filmontv tvoggi")
    itemlist = []

    # Download the page
    data = scrapertools.cache_page(item.url)

    # Extract the entries (folders)
    patron = '<li class="col-md-4 col-sm-6 col-xs-12 isotope-item Film">[^>]+>[^>]+<span class="hide">(.*?)</span>[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+[^>]+>[^>]+>+[^>]+<img src="(.*?)"[^<]+>'
    matches = re.compile(patron, re.DOTALL).findall(data)

    for scrapedthumbnail, scrapedtitle, scrapedtv in matches:
        scrapedurl = ""
        scrapedtitle = scrapertools.decodeHtmlentities(scrapedtitle).strip()
        if (DEBUG): logger.info("title=[" + scrapedtitle + "], url=[" + scrapedurl + "]")

        itemlist.append(infoSod(
            Item(channel=__channel__,
                 action="do_search",
                 extra=urllib.quote_plus(scrapedtitle) + '{}' + 'movie',
                 title=scrapedtitle + "[COLOR yellow]   " + scrapedtv + "[/COLOR]",
                 fulltitle=scrapedtitle,
                 url=scrapedurl,
                 thumbnail=scrapedthumbnail,
                 folder=True), tipo="movie"))

    return itemlist


# This is the function that actually performs the search

def do_search(item):
    from channels import buscador
    return buscador.do_search(item)
Can anyone help me?
Thanks
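
One detail in the channel code above is worth checking independently of the pattern itself: the `for scrapedthumbnail, scrapedtitle, scrapedtv in matches` loop unpacks three names, while the pattern has only two capture groups, in the order title then image. The sketch below illustrates that behaviour of `findall` on a tiny made-up string (`html` and the `example.invalid` URL are placeholders, not taken from the site). It is not claimed to be the cause of the endless loading.

```python
import re

# Made-up miniature input with the same two pieces of data.
html = '<span class="hide">Mune</span><img src="http://example.invalid/p.jpg">'
patron = r'<span class="hide">(.*?)</span>.*?<img src="(.*?)"'

# With two groups, findall returns a list of 2-tuples in group order.
matches = re.compile(patron, re.DOTALL).findall(html)

items = []
for scrapedtitle, scrapedthumbnail in matches:   # same order as the groups
    items.append({"title": scrapedtitle.strip(),
                  "thumbnail": scrapedthumbnail})

# Unpacking three names from a 2-tuple raises ValueError.
try:
    for a, b, c in matches:
        pass
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(items)
print(unpack_error)
```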

Re: Expression string

Posted: 05 May 2017, 22:27
by dentaku65
This should work (fixed, updated):

Code:

# -*- coding: utf-8 -*-
# ------------------------------------------------------------
# streamondemand - XBMC Plugin
# Netflix search
# http://www.mimediacenter.info/foro/viewforum.php?f=36
# ------------------------------------------------------------

import re
import urllib
import urlparse

from core import config
from core import logger
from core import scrapertools
from core.item import Item
from core.tmdb import infoSod

__channel__ = "netflixsrc"
__category__ = "S"
__type__ = "generic"
__title__ = "netflix search (IT)"
__language__ = "IT"

DEBUG = config.get_setting("debug")

host = "http://www.netflixlovers.it"

TIMEOUT_TOTAL = 60

def isGeneric():
    return True

def mainlist(item):
    logger.info("streamondemand.netflixsrc mainlist")
    itemlist = [Item(channel=__channel__,
                     title="[COLOR red]Serie Netflix[/COLOR]",
                     url="%s/classifiche/top-10-serie-tv-le-migliori-serie-tv-su-netflix-italia/" % host,
                     action="serietv",
                     thumbnail="http://www.netflixlovers.it/img/logo-dark.png?v=2"),
                Item(channel=__channel__,
                     title="[COLOR red]Film Netflix[/COLOR]",
                     action="film",
                     url="%s/classifiche/top-10-film-i-migliori-film-su-netflix-italia/" % host,
                     thumbnail="http://www.netflixlovers.it/img/logo-dark.png?v=2"),
                Item(channel=__channel__,
                     title="[COLOR red]Documentari Netflix[/COLOR]",
                     action="film",
                     url="%s/classifiche/top-10-documentari-i-migliori-documentari-su-netflix-italia/" % host,
                     thumbnail="http://www.netflixlovers.it/img/logo-dark.png?v=2")]

    return itemlist

def serietv(item):
    logger.info("streamondemand.netflixsrc serietv")
    itemlist = []

    # Download the page
    data = scrapertools.cache_page(item.url)

    # Extract the entries (folders); raw string so that \s reaches the regex engine untouched
    patron = r'<span class="thumb-info-title">\s*<span class="thumb-info-caption-text">([^<]+)</span>'
    matches = re.compile(patron, re.DOTALL).findall(data)

    for scrapedtitle in matches:
        scrapedurl = ""
        scrapedthumbnail = ""
        scrapedtv = "Netflix Original"
        scrapedtitle = scrapertools.decodeHtmlentities(scrapedtitle)
        scrapedtitle = scrapedtitle.split("(")[0]

        itemlist.append(infoSod(
            Item(channel=__channel__,
                 action="do_search",
                 extra=urllib.quote_plus(scrapedtitle) + '{}' + 'serie',
                 title=scrapedtitle + "[COLOR red]   " + scrapedtv + "[/COLOR]",
                 fulltitle=scrapedtitle,
                 url=scrapedurl,
                 thumbnail=scrapedthumbnail,
                 folder=True), tipo="tv"))

    # Extract the pager
    patronvideos = '<a href="([^"]+)" class="btn btn-borders btn-primary mr-xs mb-sm center">Mostra'
    matches = re.compile(patronvideos, re.DOTALL).findall(data)

    if len(matches) > 0:
        scrapedurl = urlparse.urljoin(item.url, matches[0])
        itemlist.append(
            Item(channel=__channel__,
                 action="HomePage",
                 title="[COLOR yellow]Torna Home[/COLOR]",
                 folder=True))
        itemlist.append(
            Item(channel=__channel__,
                 action="serietv",
                 title="[COLOR orange]Successivo >>[/COLOR]",
                 url=scrapedurl,
                 thumbnail="http://2.bp.blogspot.com/-fE9tzwmjaeQ/UcM2apxDtjI/AAAAAAAAeeg/WKSGM2TADLM/s1600/pager+old.png",
                 folder=True))

    return itemlist

def film(item):
    logger.info("streamondemand.netflixsrc film")
    itemlist = []

    # Download the page
    data = scrapertools.cache_page(item.url)

    # Extract the entries (folders); raw string so that \s reaches the regex engine untouched
    patron = r'<span class="thumb-info-title">\s*<span class="thumb-info-caption-text">([^<]+)</span>'
    matches = re.compile(patron, re.DOTALL).findall(data)

    for scrapedtitle in matches:
        scrapedurl = ""
        scrapedthumbnail = ""
        scrapedtv = "Netflix Original"
        scrapedtitle = scrapertools.decodeHtmlentities(scrapedtitle)
        scrapedtitle = scrapedtitle.split("(")[0]

        itemlist.append(infoSod(
            Item(channel=__channel__,
                 action="do_search",
                 extra=urllib.quote_plus(scrapedtitle) + '{}' + 'movie',
                 title=scrapedtitle + "[COLOR red]   " + scrapedtv + "[/COLOR]",
                 fulltitle=scrapedtitle,
                 url=scrapedurl,
                 thumbnail=scrapedthumbnail,
                 folder=True), tipo="movie"))

    # Extract the pager
    patronvideos = '<a href="([^"]+)" class="btn btn-borders btn-primary mr-xs mb-sm center">Mostra'
    matches = re.compile(patronvideos, re.DOTALL).findall(data)

    if len(matches) > 0:
        scrapedurl = urlparse.urljoin(item.url, matches[0])
        itemlist.append(
            Item(channel=__channel__,
                 action="HomePage",
                 title="[COLOR yellow]Torna Home[/COLOR]",
                 folder=True))
        itemlist.append(
            Item(channel=__channel__,
                 action="film",
                 title="[COLOR orange]Successivo >>[/COLOR]",
                 url=scrapedurl,
                 thumbnail="http://2.bp.blogspot.com/-fE9tzwmjaeQ/UcM2apxDtjI/AAAAAAAAeeg/WKSGM2TADLM/s1600/pager+old.png",
                 folder=True))

    return itemlist

# This is the function that actually performs the search

def do_search(item):
    from channels import buscador
    return buscador.do_search(item)

def HomePage(item):
    import xbmc
    xbmc.executebuiltin("ReplaceWindow(10024,plugin://plugin.video.streamondemand)")
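
The title clean-up, the `extra` string, and the pager URL used in `serietv` and `film` above can be tried in isolation. The sketch below is a rough illustration of those three steps, not part of the plugin. `clean_title` and `build_extra` are hypothetical helper names, the `.strip()` is an addition that the code above does not have, and the next-page `href` is invented. The import fallback is there only so the snippet runs on both Python 2 (as in the thread) and Python 3.

```python
try:
    from urlparse import urljoin            # Python 2, as in the thread
    from urllib import quote_plus
except ImportError:
    from urllib.parse import urljoin, quote_plus   # Python 3


def clean_title(raw):
    """Drop the parenthesised original title, e.g. '(Mune: Guardian ...)'."""
    return raw.split("(")[0].strip()


def build_extra(title, kind):
    """Build the 'extra' field the same way as above: quoted title + '{}' + kind."""
    return quote_plus(title) + "{}" + kind


title = clean_title("Mune il guardiano della luna (Mune: Guardian of the Moon)")
extra = build_extra(title, "movie")

# Resolving a pager link against the current page URL.
page_url = "http://www.netflixlovers.it/classifiche/top-10-film-i-migliori-film-su-netflix-italia/"
next_href = "/classifiche/pagina-2/"        # hypothetical href, for illustration only
next_url = urljoin(page_url, next_href)

print(title)
print(extra)
print(next_url)
```

Since `serietv` and `film` differ only in the `'serie'`/`'movie'` suffix, the `tipo` value and the pager action, helpers like these could be shared between them.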

Re: Expression string

Posted: 06 May 2017, 13:39
by alexdown
dentaku65 wrote:
05 May 2017, 22:27
This should work (fixed, updated)

Great, thanks! It works this way, but I wanted to take the info from this URL, netflixlovers.it/catalogo-netflix-italia/tutte-le-serie-tv-di-azione-e-avventura-su-netflix-italia/, so that I can split the items by category.
Is it possible to do that for this page?
P.S. In the code I don't understand how you take the image and use it as the poster. Could you explain?

Thanks again
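
For the category split asked about here, one possible starting point is the last word of the `<li>` class in the HTML from the first post (`isotope-item Kids`). The sketch below groups titles by that word. It assumes the catalogue page uses the same markup, which has not been verified. `SAMPLE`, `PATRON` and `group_by_category` are made-up names, and the second sample item is invented for illustration.

```python
import re
from collections import defaultdict

# First item comes from the HTML in the first post; the second is invented.
SAMPLE = (
    '<li class="col-md-4 col-sm-6 col-xs-12 isotope-item Kids">'
    '<span class="hide">Mune il guardiano della luna (Mune: Guardian of the Moon)</span></li>'
    '<li class="col-md-4 col-sm-6 col-xs-12 isotope-item Film">'
    '<span class="hide">Titolo di esempio</span></li>'
)

# Group 1: category taken from the class attribute; group 2: title.
PATRON = r'<li class="[^"]*isotope-item ([^"]+)">.*?<span class="hide">([^<]+)</span>'


def group_by_category(html):
    """Return {category: [titles]} for every item found in html."""
    groups = defaultdict(list)
    for category, title in re.compile(PATRON, re.DOTALL).findall(html):
        groups[category].append(title.split("(")[0].strip())
    return dict(groups)


by_category = group_by_category(SAMPLE)
for category in sorted(by_category):
    print(category, by_category[category])
```

Each key could then become one entry in `mainlist`, with the matching titles listed by a shared action.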

Re: Expression string

Posted: 06 May 2017, 18:24
by dentaku65
Hi alexdown,
the poster is fetched by this function:

Code:

from core.tmdb import infoSod
which is then called here:

Code:

itemlist.append(infoSod

If it finds a match for scrapedtitle, it adds the poster; otherwise it adds nothing.
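
The "match or nothing" behaviour described here can be pictured with a toy stand-in. This is emphatically not the real `infoSod` implementation, whose internals are not shown in this thread. `POSTERS` is a hypothetical table playing the role of the TMDb lookup, `add_poster` is an invented name, and plain dicts stand in for `Item` objects.

```python
# Hypothetical poster table standing in for the TMDb lookup.
POSTERS = {"mune il guardiano della luna": "http://example.invalid/mune.jpg"}


def add_poster(item, posters=POSTERS):
    """Return a copy of item, with a thumbnail only when the title matches."""
    key = item.get("fulltitle", "").strip().lower()
    poster = posters.get(key)
    enriched = dict(item)
    if poster is not None:
        enriched["thumbnail"] = poster   # match: the poster is filled in
    return enriched                      # no match: item is left as it was


found = add_poster({"fulltitle": "Mune il guardiano della luna ", "thumbnail": ""})
missing = add_poster({"fulltitle": "Titolo sconosciuto", "thumbnail": ""})

print(found["thumbnail"])
print(repr(missing["thumbnail"]))
```

This would also explain why `scrapedthumbnail` can stay empty in the channel code: the poster comes from the lookup on the title, not from the scraped page.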

As for the other request, I think it can be done, but I'm tired today :lol: