
Work on filmsubito.tv (IT)

Posted: 24 Aug 2015, 16:10
by zanzibar1982
Hi!

I am working on channel development for this site, filmsubito.tv, as it has a lot of different content.

Here's a list of issues I need help with:

1) Separating the two sections of the home page, "FILM STREAMING - NOVITÀ" and "GLI ULTIMI AGGIUNTI", as the patron I wrote extracts them all together.

It also takes a lot of time to load movies, but if I use

Code: Select all

patron  = '</span>.*?<a href="(.*?)".*?><img src="(.*?)" width="145"></span><span class="vertical-align"></span></span></a>.*?<p style="font-size:14px;font-weight:bold">(.*?)</p>.*?<p style="font-size:12px;line-height:15px">(.*?)</p>'
it gets a lot faster (but then I can't get thumbnails from "gli ultimi aggiunti" or from the search results).

2) Can't figure out the pager because there are two with the same structure.

3) Extracting the videos from the pages (no need to add new servers, I think, at least).

4) Extracting the TV shows and paging them properly.

5) Paging the genres.
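On the speed issue in point 1: with `re.DOTALL`, every extra `.*?` gap in the patron can backtrack across the whole page, which is what costs time. A reduced, hypothetical mock-up of the listing shows that when a literal attribute directly follows a capture, the gap can simply be dropped without losing matches:

```python
import re

# Reduced, hypothetical mock-up of the home-page listing (3 movie cards)
card = ('</span><a href="/film%d.html" class="x"><img src="/t%d.jpg" width="145">'
        '</span><span class="vertical-align"></span></span></a>')
html = ''.join(card % (i, i) for i in range(3))

# Loose: an extra .*? gap between src and width lets the engine backtrack further
loose = re.compile(r'<a href="(.*?)".*?><img src="(.*?)".*?width="145">', re.DOTALL)
# Tight: the literal ' width="145"' follows the src capture directly
tight = re.compile(r'<a href="(.*?)".*?><img src="(.*?)" width="145">', re.DOTALL)

print(loose.findall(html) == tight.findall(html))  # → True: same matches, less work
```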

Mostly, I lack time, as I started a new job and have to be on site earlier and leave later :(

Any help will be appreciated; here's the work I've done so far:

Code: Select all

# -*- coding: utf-8 -*-
#------------------------------------------------------------
# pelisalacarta - XBMC Plugin
# Canal para filmsubito.tv
# http://blog.tvalacarta.info/plugin-xbmc/pelisalacarta/
#------------------------------------------------------------
import urlparse
import re
import sys

from core import logger
from core import config
from core import scrapertools
from core.item import Item
from servers import servertools

__channel__ = "filmsubitotv"
__category__ = "F,A,S,D"
__type__ = "generic"
__title__ = "FilmSubito.tv"
__language__ = "IT"

sito="http://www.filmsubito.tv/"

DEBUG = config.get_setting("debug")


def isGeneric():
    return True


def mainlist(item):
    logger.info("pelisalacarta.filmsubitotv mainlist")
    itemlist = []
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Home[/COLOR]", action="peliculas", url=sito, thumbnail="http://dc584.4shared.com/img/XImgcB94/s7/13feaf0b538/saquinho_de_pipoca_01"))
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Serie Anni 80[/COLOR]", action="serie80", url=sito ))
    itemlist.append( Item(channel=__channel__, title="[COLOR yellow]Cerca...[/COLOR]", action="search", thumbnail="http://dc467.4shared.com/img/fEbJqOum/s7/13feaf0c8c0/Search"))

    return itemlist


def serie80(item):
    logger.info("pelisalacarta.filmsubitotv serie80")
    itemlist = []
    
    data = scrapertools.cache_page(item.url)
    logger.info(data)
    
    # The categories are the options for the combo
    patron = '<a href="#" class="dropdown-toggle wide-nav-link" data-toggle="dropdown">Serie anni 80<b class="caret"></b></a>.*?<li class.*? ><a title="(.*?)".*?href="(.*?)">.*?</a></li>'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedtitle,scrapedurl in matches:
        #scrapedurl = urlparse.urljoin(item.url,url)
        if (DEBUG): logger.info("title=["+scrapedtitle+"], url=["+scrapedurl+"]")
        itemlist.append( Item(channel=__channel__, action="peliculas" ,title=scrapedtitle, url=scrapedurl))

    return itemlist


def peliculas(item):
    logger.info("pelisalacarta.filmsubitotv peliculas")
    itemlist = []

    # Descarga la pagina
    data = scrapertools.cache_page(item.url)

    # Extrae las entradas (carpetas)
    patron  = '</span>.*?<a href="(.*?)".*?><img src="(.*?)".*?width="145"></span><span class="vertical-align"></span></span></a>.*?<p style="font-size:14px;font-weight:bold">(.*?)</p>.*?<p style="font-size:12px;line-height:15px">(.*?)</p>'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedthumbnail,scrapedtitle,scrapedplot in matches:
        if (DEBUG): logger.info("title=["+scrapedtitle+"], url=["+scrapedurl+"], thumbnail=["+scrapedthumbnail+"]")
        itemlist.append( Item(channel=__channel__, action="findvideos", title=scrapedtitle, url=scrapedurl , thumbnail=scrapedthumbnail , plot=scrapedplot, folder=True, fanart=scrapedthumbnail) )

    # Extrae el paginador
    patronvideos  = '<li class="">.*?<a href="(.*?)">&raquo;</a>.*?</li>.*?</ul>'
    matches = re.compile(patronvideos,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    if len(matches)>0:
        scrapedurl = urlparse.urljoin(item.url,matches[0])
        itemlist.append( Item(channel=__channel__, extra=item.extra, action="peliculas", title="[COLOR orange]Successivo>>[/COLOR]" , url=scrapedurl , thumbnail="http://2.bp.blogspot.com/-fE9tzwmjaeQ/UcM2apxDtjI/AAAAAAAAeeg/WKSGM2TADLM/s1600/pager+old.png", folder=True) )

    return itemlist


def search(item,texto):
    logger.info("[filmsubitotv.py] "+item.url+" search "+texto)
    item.url = "http://www.filmsubito.tv/search.php?keywords="+texto
    try:
        return peliculas(item)
    # Se captura la excepción, para no interrumpir al buscador global si un canal falla
    except:
        import sys
        for line in sys.exc_info():
            logger.error( "%s" % line )
        return []

Re: Work on filmsubito.tv (IT)

Posted: 27 Aug 2015, 13:27
by zanzibar1982
So I was able to build a menu structure for filmsubito.tv, but I still can't extract videos from the embedded players.

Here's the code:

Code: Select all

# -*- coding: utf-8 -*-
#------------------------------------------------------------
# pelisalacarta - XBMC Plugin
# Canal para filmsubito.tv
# http://blog.tvalacarta.info/plugin-xbmc/pelisalacarta/
#------------------------------------------------------------
import urlparse
import re
import sys

from core import logger
from core import config
from core import scrapertools
from core.item import Item
from servers import servertools

__channel__ = "filmsubitotv"
__category__ = "F,A,S"
__type__ = "generic"
__title__ = "FilmSubito.tv"
__language__ = "IT"

sito="http://www.filmsubito.tv/"

DEBUG = config.get_setting("debug")


def isGeneric():
    return True


def mainlist(item):
    logger.info("pelisalacarta.filmsubitotv mainlist")
    itemlist = []
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Film - Novità[/COLOR]", action="peliculas", url=sito+"film-2015-streaming.html?&page=2", thumbnail="http://dc584.4shared.com/img/XImgcB94/s7/13feaf0b538/saquinho_de_pipoca_01"))
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Film per Genere[/COLOR]", action="genere", url=sito ))
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Film per Anno[/COLOR]", action="anno", url=sito ))
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Serie TV degli anni '80[/COLOR]", action="serie80", url=sito ))
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Cartoni animati degli anni '80[/COLOR]", action="cartoni80", url=sito ))
    itemlist.append( Item(channel=__channel__, title="[COLOR azure]Documentari[/COLOR]", action="documentari", url=sito ))
    itemlist.append( Item(channel=__channel__, title="[COLOR yellow]Cerca...[/COLOR]", action="search", thumbnail="http://dc467.4shared.com/img/fEbJqOum/s7/13feaf0c8c0/Search"))

    return itemlist

def search(item,texto):
    logger.info("[filmsubitotv.py] "+item.url+" search "+texto)
    item.url = "http://www.filmsubito.tv/search.php?keywords="+texto
    try:
        return peliculas(item)
    # Se captura la excepción, para no interrumpir al buscador global si un canal falla
    except:
        import sys
        for line in sys.exc_info():
            logger.error( "%s" % line )
        return []

def peliculas(item):
    logger.info("pelisalacarta.filmsubitotv peliculas")
    itemlist = []

    # Descarga la pagina
    data = scrapertools.cache_page(item.url)

    # Extrae las entradas (carpetas)
    patron = '</span>.*?<a href="(.*?)" class="pm-thumb-fix pm-thumb-145">.*?><img src="(.*?)" alt="(.*?)" width="145">'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedthumbnail,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        if (DEBUG): logger.info("title=["+scrapedtitle+"], url=["+scrapedurl+"], thumbnail=["+scrapedthumbnail+"]")
        itemlist.append( Item(channel=__channel__, action="findvideos", title=title, url=scrapedurl , thumbnail=scrapedthumbnail, folder=True, fanart=scrapedthumbnail) )

    # Extrae el paginador
    patronvideos = '<a href="([^"])">&raquo;</a>'
    matches = re.compile(patronvideos,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    if len(matches)>0:
        scrapedurl = urlparse.urljoin(item.url,matches[0])
        itemlist.append( Item(channel=__channel__, extra=item.extra, action="peliculas", title="[COLOR orange]Successivo>>[/COLOR]" , url=scrapedurl , thumbnail="http://2.bp.blogspot.com/-fE9tzwmjaeQ/UcM2apxDtjI/AAAAAAAAeeg/WKSGM2TADLM/s1600/pager+old.png", folder=True) )

    return itemlist

def genere(item):
    logger.info("[filmsubitotv.py] genere")
    itemlist = []

    data = scrapertools.cachePage(item.url)
    logger.info("data="+data)

    data = scrapertools.find_single_match(data,'<a href="#" class="dropdown-toggle wide-nav-link" data-toggle="dropdown">Genere <b class="caret"></b></a>(.*?)<li><a href="http://www.filmsubito.tv/film-2015-streaming.html" class="wide-nav-link">Novità</a></li>')
    logger.info("data="+data)

    patron  = '<a.*?href="(.*?)" class=".*?>(.*?)</a>'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        itemlist.append( Item(channel=__channel__, action="peliculas", title=title, url=scrapedurl, folder=True))

    return itemlist

def serie80(item):
    logger.info("[filmsubitotv.py] serie80")
    itemlist = []

    data = scrapertools.cachePage(item.url)
    logger.info("data="+data)

    data = scrapertools.find_single_match(data,'<a href="#" class="dropdown-toggle wide-nav-link" data-toggle="dropdown">Serie anni 80<b class="caret"></b></a>(.*?)<li class="dropdown">')
    logger.info("data="+data)

    patron  = '<a.*?href="(.*?)">(.*?)</a></li>'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        itemlist.append( Item(channel=__channel__, action="peliculas", title=title, url=scrapedurl, folder=True))

    return itemlist

def anno(item):
    logger.info("[filmsubitotv.py] anno")
    itemlist = []

    data = scrapertools.cachePage(item.url)
    logger.info("data="+data)

    data = scrapertools.find_single_match(data,'<a href="#" class="dropdown-toggle wide-nav-link" data-toggle="dropdown">Anno<b class="caret"></b></a>(.*?)<li class="dropdown">')
    logger.info("data="+data)

    patron  = '<a.*?href="(.*?)">(.*?)</a></li>'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        itemlist.append( Item(channel=__channel__, action="peliculas", title=title, url=scrapedurl, folder=True))

    return itemlist

def cartoni80(item):
    logger.info("[filmsubitotv.py] cartoni80")
    itemlist = []

    data = scrapertools.cachePage(item.url)
    logger.info("data="+data)

    data = scrapertools.find_single_match(data,'<a href="#" class="dropdown-toggle wide-nav-link" data-toggle="dropdown">Cartoni anni 80<b class="caret"></b></a>(.*?)<li class="dropdown">')
    logger.info("data="+data)

    patron  = '<a.*?href="(.*?)">(.*?)</a></li>'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        itemlist.append( Item(channel=__channel__, action="peliculas", title=title, url=scrapedurl, folder=True))

    return itemlist

def documentari(item):
    logger.info("[filmsubitotv.py] documentari")
    itemlist = []

    data = scrapertools.cachePage(item.url)
    logger.info("data="+data)

    data = scrapertools.find_single_match(data,'<a href="#" class="dropdown-toggle wide-nav-link" data-toggle="dropdown">Documentari<b class="caret"></b></a>(.*?)<li class="dropdown">')
    logger.info("data="+data)

    patron  = '<a.*?href="(.*?)">(.*?)</a></li>'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        itemlist.append( Item(channel=__channel__, action="peliculas", title=title, url=scrapedurl, folder=True))

    return itemlist

def serie(item):
    logger.info("pelisalacarta.filmsubitotv serie")
    itemlist = []

    # Descarga la pagina
    data = scrapertools.cache_page(item.url)

    # Extrae las entradas (carpetas)
    patron = '</span>.*?<a href="(.*?)" class="pm-thumb-fix pm-thumb-145">.*?"><img.*?src="(.*?)" title=".*?" alt="(.*?)" width="145">'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedthumbnail,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        if (DEBUG): logger.info("title=["+scrapedtitle+"], url=["+scrapedurl+"], thumbnail=["+scrapedthumbnail+"]")
        itemlist.append( Item(channel=__channel__, action="findvideos", title=title, url=scrapedurl , thumbnail=scrapedthumbnail, folder=True, fanart=scrapedthumbnail) )

    # Extrae el paginador
    patronvideos = '<a href="([^"])">&raquo;</a>'
    matches = re.compile(patronvideos,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    if len(matches)>0:
        scrapedurl = urlparse.urljoin(item.url,matches[0])
        itemlist.append( Item(channel=__channel__, extra=item.extra, action="peliculas", title="[COLOR orange]Successivo>>[/COLOR]" , url=scrapedurl , thumbnail="http://2.bp.blogspot.com/-fE9tzwmjaeQ/UcM2apxDtjI/AAAAAAAAeeg/WKSGM2TADLM/s1600/pager+old.png", folder=True) )

    return itemlist

Re: Work on filmsubito.tv (IT)

Posted: 28 Aug 2015, 17:41
by robalo
Hi zanzibar

To create/extract the links, try implementing this:

Code: Select all

servers = {
    '2':'http://embed.nowvideo.li/embed.php?v=%s'
    '16':'http://youwatch.org/embed-%s-640x360.html',
    '22':'http://www.exashare.com/embed-%s-700x400.html',
    '23':'http://videomega.tv/cdn.php?ref=%s&width=700&height=430',
    '29':'http://embed.novamov.com/embed.php?v=%s',
}
patron = "=.setupNewPlayer.('([^']+)','(\d+)'"
matches = re.compile( patron, re.DOTALL ).findall( data )
for video_id, i in matches:
    scrapedurl = servers[i] % video_id

Re: Work on filmsubito.tv (IT)

Posted: 28 Aug 2015, 23:07
by zanzibar1982
Excuse me robalo, but where? Should I put "servers" as an action instead of "findvideos"?
Thank you.

Edit: I understand that this snippet builds the video urls by joining each server's url template with the video id :) this is very handy
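As a quick sanity check of that idea (a hypothetical call, with the server number and video id taken from this thread):

```python
# One entry from robalo's table; '16' is youwatch
servers = {
    '16': 'http://youwatch.org/embed-%s-640x360.html',
}

# A page calling setupNewPlayer('efrwqm1viat7','16') would resolve to:
video_id, i = 'efrwqm1viat7', '16'
url = servers[i] % video_id
print(url)  # → http://youwatch.org/embed-efrwqm1viat7-640x360.html
```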

EDIT 2:

So this is what I built in the channel:

Code: Select all

def peliculas(item):
    logger.info("pelisalacarta.filmsubitotv peliculas")
    itemlist = []

    # Descarga la pagina
    data = scrapertools.cache_page(item.url)

    # Extrae las entradas (carpetas)
    patron = '</span>.*?<a href="(.*?)" class="pm-thumb-fix pm-thumb-145">.*?><img src="(.*?)" alt="(.*?)" width="145">'
    matches = re.compile(patron,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    for scrapedurl,scrapedthumbnail,scrapedtitle in matches:
        title = scrapertools.decodeHtmlentities( scrapedtitle )
        if (DEBUG): logger.info("title=["+scrapedtitle+"], url=["+scrapedurl+"], thumbnail=["+scrapedthumbnail+"]")
        itemlist.append( Item(channel=__channel__, action="findvid", title=title, url=scrapedurl , thumbnail=scrapedthumbnail, folder=True, fanart=scrapedthumbnail) )

    # Extrae el paginador
    patronvideos = '<a href="([^"])">&raquo;</a>'
    matches = re.compile(patronvideos,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    if len(matches)>0:
        scrapedurl = urlparse.urljoin(item.url,matches[0])
        itemlist.append( Item(channel=__channel__, extra=item.extra, action="peliculas", title="[COLOR orange]Successivo>>[/COLOR]" , url=scrapedurl , thumbnail="http://2.bp.blogspot.com/-fE9tzwmjaeQ/UcM2apxDtjI/AAAAAAAAeeg/WKSGM2TADLM/s1600/pager+old.png", folder=True) )

    return itemlist

def findvid( item ):
    logger.info( "[filmsubitotv.py] findvid" )

    itemlist = []
    data = scrapertools.cache_page( item.url )
    servers = {
        '2':'http://embed.nowvideo.li/embed.php?v=%s',
        '16':'http://youwatch.org/embed-%s-640x360.html',
        '22':'http://www.exashare.com/embed-%s-700x400.html',
        '23':'http://videomega.tv/cdn.php?ref=%s&width=700&height=430',
        '29':'http://embed.novamov.com/embed.php?v=%s',
        }
    patron = "=.setupNewPlayer.('([^']+)','(\d+)'"
    matches = re.compile( patron, re.DOTALL ).findall( data )
    
    for video_id, i in matches:
        scrapedurl = servers[i] % video_id

        itemlist.append( Item( channel=__channel__, action="play" , title=item.title, url=scrapedurl, thumbnail=item.thumbnail, fulltitle=item.fulltitle, show=item.show, folder=False ) )

    return itemlist

def play( item ):
    logger.info( "[filmsubitotv.py] play" )

    ## Sólo es necesario la url
    data = item.url

    itemlist = servertools.find_video_items( data=data )

    for videoitem in itemlist:
        videoitem.title = item.show
        videoitem.fulltitle = item.fulltitle
        videoitem.thumbnail = item.thumbnail
        videoitem.channel = __channel__

    return itemlist
But I get an error in the log when trying to open videos for the "Minions" movie:

Code: Select all

09:24:21 T:3388   ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
                                             - NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
                                            Error Type: <class 'sre_constants.error'>
                                            Error Contents: unbalanced parenthesis
                                            Traceback (most recent call last):
                                              File "C:\Users\Barracuda\AppData\Roaming\Kodi\addons\pelisalacarta-italian-channels\default.py", line 27, in <module>
                                                launcher.run()
                                              File "C:\Users\Barracuda\AppData\Roaming\Kodi\addons\pelisalacarta-italian-channels\platformcode\xbmc\launcher.py", line 308, in run
                                                exec "itemlist = channel."+action+"(item)"
                                              File "<string>", line 1, in <module>
                                              File "C:\Users\Barracuda\AppData\Roaming\Kodi\addons\pelisalacarta-italian-channels\pelisalacarta\channels\filmsubitotv.py", line 98, in findvid
                                                matches = re.compile( patron, re.DOTALL ).findall( data )
                                              File "C:\Program Files (x86)\Kodi\system\python\Lib\re.py", line 190, in compile
                                                return _compile(pattern, flags)
                                              File "C:\Program Files (x86)\Kodi\system\python\Lib\re.py", line 244, in _compile
                                                raise error, v # invalid expression
                                            error: unbalanced parenthesis
                                            -->End of Python script error report<--
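The "unbalanced parenthesis" comes from the patron itself: the '(' after 'setupNewPlayer.' opens a regex group that is never closed. A minimal reproduction:

```python
import re

# Stray '(' opens an unclosed group -> re.compile raises the error from the log
bad = r"=.setupNewPlayer.('([^']+)','(\d+)'"
try:
    re.compile(bad)
except re.error as e:
    print("re.error:", e)

# Without the stray '(' the pattern compiles fine
good = r"=.setupNewPlayer.'([^']+)','(\d+)'"
re.compile(good)
```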

Re: Work on filmsubito.tv (IT)

Posted: 29 Aug 2015, 08:27
by robalo
It is indeed handy :)

servers has an error: it's missing a ',' at the end of the first line; I forgot to add it when reordering the lines.
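Incidentally, that missing ',' would not fail silently: inside a dict literal the two adjacent string literals concatenate into one value, and the ':' that follows then raises a SyntaxError. A hypothetical two-entry reduction:

```python
# servers dict with the ',' missing after the first entry
src = "{'2': 'http://embed.nowvideo.li/embed.php?v=%s' '16': 'http://youwatch.org/embed-%s-640x360.html'}"
try:
    eval(src)
    print("parsed")
except SyntaxError as e:
    print("SyntaxError:", e)
```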

The easiest way to implement it is by creating the 'findvideos' function. Following your line of work and using DrZ3r0's style, it could look like this:

Code: Select all

def findvideos( item ):
    logger.info( "[filmsubitotv.py] findvideos" )

    ## Descarga la página
    data = scrapertools.cache_page( item.url )

    ## ---------------------------------------------------------------
    servers = {
        '2':'http://embed.nowvideo.li/embed.php?v=%s',
        '16':'http://youwatch.org/embed-%s-640x360.html',
        '22':'http://www.exashare.com/embed-%s-700x400.html',
        '23':'http://videomega.tv/cdn.php?ref=%s&width=700&height=430',
        '29':'http://embed.novamov.com/embed.php?v=%s'
    }

    patron = "=.setupNewPlayer.'([^']+)','(\d+)'"
    matches = re.compile( patron, re.DOTALL ).findall( data )

    data = ""
    for video_id, i in matches:
        try: data+= servers[i] % video_id + "\n"
        except: pass
    ## ---------------------------------------------------------------

    itemlist = servertools.find_video_items(data=data)

    for videoitem in itemlist:
        videoitem.title = "".join([item.title, videoitem.title])
        videoitem.fulltitle = item.fulltitle
        videoitem.thumbnail = item.thumbnail
        videoitem.show = item.show
        videoitem.channel = __channel__

    return itemlist

Re: Work on filmsubito.tv (IT)

Posted: 29 Aug 2015, 08:36
by zanzibar1982
Good morning, robalo

Code: Select all

def findvideos( item ):
    logger.info( "[filmsubitotv.py] findvideos" )

    ## Descarga la página
    data = scrapertools.cache_page( item.url )

    ## ---------------------------------------------------------------
    servers = {
        '2':'http://embed.nowvideo.li/embed.php?v=%s',
        '16':'http://youwatch.org/embed-%s-640x360.html',
        '22':'http://www.exashare.com/embed-%s-700x400.html',
        '23':'http://videomega.tv/cdn.php?ref=%s&width=700&height=430',
        '29':'http://embed.novamov.com/embed.php?v=%s'
    }

    patron = "=.setupNewPlayer.'([^']+)','(\d+)'"
    matches = re.compile( patron, re.DOTALL ).findall( data )

    data = ""
    for video_id, i in matches:
        try: data+= servers[i] % video_id + "\n"
        except: pass
    ## ---------------------------------------------------------------

    itemlist = servertools.find_video_items(data=data)

    for videoitem in itemlist:
        videoitem.title = "".join([item.title, videoitem.title])
        videoitem.fulltitle = item.fulltitle
        videoitem.thumbnail = item.thumbnail
        videoitem.show = item.show
        videoitem.channel = __channel__

    return itemlist
This returns a blank page, hmmm... I'll work on this as soon as I get back from work.

Re: Work on filmsubito.tv (IT)

Posted: 29 Aug 2015, 10:10
by robalo
In peliculas I've modified the pattern, because it wasn't giving me the right url for 'findvideos':

Code: Select all

    # Extrae las entradas (carpetas)
    #patron = '</span>.*?<a href="(.*?)" class="pm-thumb-fix pm-thumb-145">.*?><img src="(.*?)" alt="(.*?)" width="145">'
    patron = '<span class="pm-video-li-thumb-info".*?'
    patron+= 'href="([^"]+)".*?'
    patron+= 'src="([^"]+)" '
    patron+= 'alt="([^"]+)"'
The 'play' function is not needed.
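Applied to a reduced, hypothetical snippet of the listing markup, the rebuilt pattern captures url, thumbnail and title in one pass:

```python
import re

# Hypothetical snippet mimicking one listing entry on the site
html = ('<span class="pm-video-li-thumb-info">'
        '<a href="http://www.filmsubito.tv/minions.html" class="pm-thumb-fix pm-thumb-145">'
        '<span><img src="http://www.filmsubito.tv/thumb.jpg" alt="Minions" width="145">')

patron = '<span class="pm-video-li-thumb-info".*?'
patron += 'href="([^"]+)".*?'
patron += 'src="([^"]+)" '
patron += 'alt="([^"]+)"'

matches = re.compile(patron, re.DOTALL).findall(html)
print(matches)
# → [('http://www.filmsubito.tv/minions.html', 'http://www.filmsubito.tv/thumb.jpg', 'Minions')]
```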

Re: Work on filmsubito.tv (IT)

Posted: 29 Aug 2015, 13:44
by zanzibar1982
That really works :)

I am using the youwatch server connector from 4.0.2:

Code: Select all

# -*- coding: utf-8 -*-
#------------------------------------------------------------
# pelisalacarta - XBMC Plugin
# Conector para youwatch
# http://blog.tvalacarta.info/plugin-xbmc/pelisalacarta/
#------------------------------------------------------------

import urlparse,urllib2,urllib,re
import os

from core import scrapertools
from core import logger
from core import config
from core import unpackerjs3

def test_video_exists( page_url ):
    logger.info("youwatch test_video_exists(page_url='%s')" % page_url)
    return True,""

def get_video_url( page_url , premium = False , user="" , password="", video_password="" ):
    logger.info("youwatch get_video_url(page_url='%s')" % page_url)
    if not "embed" in page_url:
      page_url = page_url.replace("http://youwatch.org/","http://youwatch.org/embed-") + ".html"

    data = scrapertools.cache_page(page_url)
    data = scrapertools.find_single_match(data,"<span id='flvplayer'></span>\n<script type='text/javascript'>(.*?)\n;</script>")
    data = unpackerjs3.unpackjs(data,0)
    url = scrapertools.get_match(data, 'file:"([^"]+)"')
    video_urls = []
    video_urls.append([scrapertools.get_filename_from_url(url)[-4:]+" [youwatch]",url])

    for video_url in video_urls:
        logger.info("[youwatch.py] %s - %s" % (video_url[0],video_url[1]))
        

    return video_urls

# Encuentra vídeos del servidor en el texto pasado
def find_videos(data):
    encontrados = set()
    devuelve = []


    patronvideos  = 'http://youwatch.org/([a-z0-9]+)'
    logger.info("youwatch find_videos #"+patronvideos+"#")
    matches = re.compile(patronvideos,re.DOTALL).findall(data)

    for match in matches:
        titulo = "[youwatch]"
        url = "http://youwatch.org/"+match
        if url not in encontrados and match!="embed":
            logger.info("  url="+url)
            devuelve.append( [ titulo , url , 'youwatch' ] )
            encontrados.add(url)
        else:
            logger.info("  url duplicada="+url)
            

    patronvideos  = 'http://youwatch.org/embed-([a-z0-9]+)'
    logger.info("youwatch find_videos #"+patronvideos+"#")
    matches = re.compile(patronvideos,re.DOTALL).findall(data)

    for match in matches:
        titulo = "[youwatch]"
        url = "http://youwatch.org/"+match
        if url not in encontrados:
            logger.info("  url="+url)
            devuelve.append( [ titulo , url , 'youwatch' ] )
            encontrados.add(url)
        else:
            logger.info("  url duplicada="+url)
            
    return devuelve

def test():
    video_urls = get_video_url("http://youwatch.org/crbt4sja1jvo")

    return len(video_urls)>0
And trying to open this: http://www.filmsubito.tv/c7197ad19/quan ... rcrew.html

I get "se ha producido un error con el conector youwatch http://youwatch.org/efrwqm1viat7"

I'm trying to contact the site's developer to push for uploading the videos to other servers too :D

Re: Work on filmsubito.tv (IT)

Posted: 29 Aug 2015, 15:48
by zanzibar1982
Excuse me, but it might be the tremendous heat these days:

I can't extract the pager :|

Code: Select all

    # Extrae el paginador
    patronvideos = '<li class="">.*?<a href="(.*?)">&raquo;</a>.*?</li>.*?</ul>.*?</div>'
    matches = re.compile(patronvideos,re.DOTALL).findall(data)
    scrapertools.printMatches(matches)

    if len(matches)>0:
        scrapedurl = urlparse.urljoin(item.url,matches[0])
        itemlist.append( Item(channel=__channel__, extra=item.extra, action="peliculas", title="[COLOR orange]Successivo>>[/COLOR]" , url=scrapedurl , thumbnail="http://2.bp.blogspot.com/-fE9tzwmjaeQ/UcM2apxDtjI/AAAAAAAAeeg/WKSGM2TADLM/s1600/pager+old.png", folder=True) )

    return itemlist

Re: Work on filmsubito.tv (IT)

Publicado: 29 Ago 2015, 19:37
por robalo
I see it's driving you crazy over a little thing :)

The way you had it before was fine, or almost fine.

Code: Select all

    patronvideos = '<a href="([^"])">&raquo;</a>'
It's only missing a '+' character in '"([^"])"'.

These things happen to me a lot; at first I would drive myself crazy looking for the problem :)
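A quick check on a hypothetical pager snippet shows the difference: '[^"]' consumes exactly one character, so a multi-character href can never match, while '[^"]+' captures it:

```python
import re

html = '<li class=""><a href="?page=2">&raquo;</a></li>'

print(re.findall('<a href="([^"])">&raquo;</a>', html))   # → [] (wants exactly one char)
print(re.findall('<a href="([^"]+)">&raquo;</a>', html))  # → ['?page=2']
```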