Understanding Urlparser

urlparser.py

Role

There are two types of URL that we can find on sites:

Direct URLs: they point directly to a stream or an m3u8 playlist and are passed to the player without any further processing.
URLs of a video server (e.g. Youtube, Openload, Verystream, Videoweed, Nowvideo, ...): they open a page on the server and show the video only after some operations (viewing ads, answering questions, pressing play in the player).

The role of urlparser is to find the direct URLs inside server pages without any action by the user. This is a difficult task, because the programmers of video servers want users to view their pages in a browser and to reach the videos only after the ads. So direct URLs are obfuscated and sometimes encrypted. The good news is that once the method for getting URLs from a server has been written, it can be used by every host that contains URLs pointing to that server.
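
To make the distinction concrete, here is a minimal sketch that guesses whether a URL is direct by looking at the extension of its path. This is only an illustrative assumption: the helper `looks_direct` and the extension list are hypothetical and are not taken from urlparser.py, which decides by looking up the domain instead.

```python
from urllib.parse import urlparse

# Hypothetical list of extensions that usually mean "playable as-is".
DIRECT_EXTENSIONS = ('.m3u8', '.mp4', '.ts', '.mkv', '.flv')


def looks_direct(url):
    """Return True when the URL path ends with a known media extension."""
    # Only the path is checked, so query strings such as ?token=... are ignored.
    path = urlparse(url).path.lower()
    return path.endswith(DIRECT_EXTENSIONS)


print(looks_direct('https://onlystream.tv/e/6pvw6ticrzlg/'))
print(looks_direct('http://example.com/live/playlist.m3u8?token=abc'))
```

A URL that fails this kind of check is a candidate for a server-specific parser.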

Structure

urlparser.py is a very large file in the folder IPTVPlayer/libs. It works like a huge switch statement: a big dict lists all the domains it handles.

self.hostMap = {
                       'lookmovie.ag' :         self.pp.parserLOOKMOVIE,
                       '1fichier.com':          self.pp.parser1FICHIERCOM    ,
                       '1tv.ru':                self.pp.parser1TVRU          ,
                       '37.220.36.15':          self.pp.parserMOONWALKCC    ,
                       '7cast.net':             self.pp.parser7CASTNET      ,
                       'abcast.biz':            self.pp.parserABCASTBIZ     ,
                       'abcast.net':            self.pp.parserABCASTBIZ     ,
                       'aflamyz.com':           self.pp.parserAFLAMYZCOM     ,
                       'akvideo.stream':        self.pp.parserAKVIDEOSTREAM ,
...............
                       'yourvideohost.com':     self.pp.parserYOURVIDEOHOST ,
                       'youtu.be':              self.pp.parserYOUTUBE       ,
                       'youtube.com':           self.pp.parserYOUTUBE       ,
                       'youtube-nocookie.com':  self.pp.parserYOUTUBE       ,
                       'youwatch.org':          self.pp.parserYOUWATCH      ,
                       'yukons.net':            self.pp.parserYUKONS        ,
                       'zalaa.com':             self.pp.parserZALAACOM      ,
                       'zerocast.tv':           self.pp.parserZEROCASTTV    ,
                       'zstream.to':            self.pp.parserZSTREAMTO      
        }

The second element of each pair is the name of the function called to find direct URLs for that domain.
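
The dict-as-switch idea can be sketched in a few standalone lines. This is not the real urlparser code: `HOST_MAP`, `get_host_name`, `get_parser`, `resolve` and the two dummy parsers are hypothetical names chosen for illustration, and the real lookup may apply extra fallbacks that are not shown here.

```python
from urllib.parse import urlparse


def parser_youtube(url):
    # Dummy parser: a real one would fetch the page and extract stream URLs.
    return [{'name': 'youtube', 'url': url}]


def parser_abcast(url):
    return [{'name': 'abcast', 'url': url}]


# Several domains may share one parser function, as in the real hostMap.
HOST_MAP = {
    'youtu.be': parser_youtube,
    'youtube.com': parser_youtube,
    'abcast.biz': parser_abcast,
    'abcast.net': parser_abcast,
}


def get_host_name(url):
    """Return the lower-case host name without a leading 'www.'."""
    host = (urlparse(url).hostname or '').lower()
    if host.startswith('www.'):
        host = host[4:]
    return host


def get_parser(url):
    """Look up the parser for the URL's domain, or None if unsupported."""
    return HOST_MAP.get(get_host_name(url))


def resolve(url):
    """Dispatch to the matching parser; unknown hosts give an empty list."""
    parser = get_parser(url)
    if parser is None:
        return []
    return parser(url)
```

With this layout, supporting a new server means adding one dict entry and one function, which is the procedure followed in the next section.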

Adding a new server

In the previous example we found the URL https://onlystream.tv/e/6pvw6ticrzlg/, but urlparser could not recognize it, as shown in the log:

_________________getHostName: [https://onlystream.tv/e/6pvw6ticrzlg/] -> [onlystream.tv]
urlparser.getParser II try host[onlystream.tv]->host2[tv]
https://onlystream.tv/e/6pvw6ticrzlg/

So if we want to watch videos from onlystream, we have to add this server to urlparser. Let's add a line to the dict:

                       'onet.pl':               self.pp.parserONETTV        ,
                       'onet.tv':               self.pp.parserONETTV        ,
                       'onlystream.tv':         self.pp.parserONLYSTREAM    ,
                       'openlive.org':          self.pp.parserOPENLIVEORG   ,

and let's start coding its function:

    def parserONLYSTREAM(self, baseUrl):
        printDBG("parserONLYSTREAM baseUrl[%s]" % baseUrl)

        # do something
        return []

Let's code the function for the server "onlystream.tv". Opening a page, we get source code that contains a lot of JavaScript. This is typical of this kind of page, because JavaScript routines dynamically modify the page and open new windows to show us annoying ads. Looking for the URL, luckily we can find these lines:

  jwplayer("vplayer").setup({
    sources: [{file:"https://t29old.ostreamcdn.com/u5kj6x6qcdhlsdgge7dggmqcjyxf7jc2d3vrlxtncj2jbw524ykihw7ie43q/v.mp4",label:"720p"}],
     "logo": {
    "file": "//cdn.onlystream.tv/images/player-logo.png",
    "link": "https://onlystream.tv",
    "hide": "false",
    "position": "control-bar"
  },

and these ones

tracks: [{file: "https://onlystream.tv/dl?op=get_slides&length=5906.76&url=https://t29old.ostreamcdn.com/i/01/00195/dtf032mrwm3y0000.jpg", kind: "thumbnails"},
{file: "https://onlystream.tv/srt/00324/6pvw6ticrzlg_Serbian.vtt", label: "Serbian", kind: "captions","default": true},
{file: "https://onlystream.tv/srt/empty.srt", label: "Upload SRT", kind: "captions"}]

which contain the direct URL of the stream and a URL for subtitles.
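
Before wiring anything into urlparser.py, the extraction can be tried offline on the snippet shown above. The following is a minimal sketch using only the standard library; `SAMPLE`, `quote_keys` and `extract_sources` are hypothetical names, and standard `json.loads` is used here in place of whatever JSON helper the project itself provides.

```python
import json
import re

# Part of the player setup code shown above, pasted as a test fixture.
SAMPLE = '''
  jwplayer("vplayer").setup({
    sources: [{file:"https://t29old.ostreamcdn.com/u5kj6x6qcdhlsdgge7dggmqcjyxf7jc2d3vrlxtncj2jbw524ykihw7ie43q/v.mp4",label:"720p"}],
     "logo": {
    "file": "//cdn.onlystream.tv/images/player-logo.png",
'''


def quote_keys(txt):
    """Turn bare JavaScript keys (file:, label:, kind:) into quoted JSON keys."""
    for key in ('file', 'label', 'kind'):
        txt = txt.replace(key + ':', '"' + key + '":')
    return txt


def extract_sources(page):
    """Return the list of source dicts found in the page, or [] if none."""
    match = re.search(r'sources: \[(.*?)\]', page, re.S)
    if not match:
        return []
    return json.loads(quote_keys('[' + match.group(1) + ']'))


for source in extract_sources(SAMPLE):
    print(source['label'], source['file'])
```

The plain string replacement is deliberately naive: it would also rewrite a value that happens to contain `file:`, `label:` or `kind:`, so it only suits pages shaped like this one.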

Using regular expressions, we can grab the useful URLs. The final function is:

    def parserONLYSTREAM(self, baseUrl):
        printDBG("parserONLYSTREAM baseUrl[%s]" % baseUrl)

        def checkTxt(txt):
            # Quote the bare JavaScript keys so that the text becomes valid JSON.
            # No find() guard is needed: replace() does nothing when the key is
            # absent, and find() returns -1 (truthy) when nothing is found.
            txt = txt.replace('\n', ' ')
            for key in ('file', 'label', 'kind'):
                txt = txt.replace(key + ':', '"' + key + '":')
            return txt
                
        sts, data = self.cm.getPage(baseUrl)
        if not sts:
            return []

        urlsTab=[]
        subTracks = []
        
        # subtitles search
        t = re.findall(r"tracks: \[(.*?)\]", data, re.S)
        if t:
            txt = checkTxt("[" + t[0] + "]")
            printDBG(txt)
            tracks = json_loads(txt)
            printDBG(str(tracks))
            
            for tr in tracks:
                if tr.get('kind','') == 'captions':
                    printDBG(str(tr))
                    srtUrl = tr.get('file','')
                    if srtUrl != '' and not ('empty.srt' in srtUrl):
                        label = tr.get('label', 'srt')
                        srtFormat = srtUrl[-3:]
                        params = {'title': label, 'url': srtUrl, 'lang': label.lower()[:3], 'format': srtFormat}
                        printDBG(str(params))
                        subTracks.append(params)
                    
        # stream search
        s = re.findall(r"sources: \[(.*?)\]", data, re.S)
        if not s:
            return []
        
        txt = checkTxt("[" + s[0] + "]")
        printDBG(txt)
        
        links = json_loads(txt)
        #printDBG(str(links))
        for link in links:
            if 'file' in link:
                url = urlparser.decorateUrl(link['file'], {'Referer': baseUrl, 'external_sub_tracks': subTracks})
                params = {'name': link.get('label', 'link'), 'url': url}
                printDBG(str(params))
                urlsTab.append(params)
        
        return urlsTab
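
The subtitle filtering step of the function above can also be checked in isolation. This sketch assumes the tracks have already been parsed into a list of dicts; `subtitle_tracks` and `SAMPLE_TRACKS` are hypothetical names, and the output dicts simply mirror the keys built in the function above.

```python
# Tracks from the sample page, as they would look after JSON parsing.
SAMPLE_TRACKS = [
    {'file': 'https://onlystream.tv/dl?op=get_slides&length=5906.76&url=https://t29old.ostreamcdn.com/i/01/00195/dtf032mrwm3y0000.jpg',
     'kind': 'thumbnails'},
    {'file': 'https://onlystream.tv/srt/00324/6pvw6ticrzlg_Serbian.vtt',
     'label': 'Serbian', 'kind': 'captions', 'default': True},
    {'file': 'https://onlystream.tv/srt/empty.srt',
     'label': 'Upload SRT', 'kind': 'captions'},
]


def subtitle_tracks(tracks):
    """Keep real caption tracks and describe each one with title/url/lang/format."""
    subs = []
    for track in tracks:
        if track.get('kind', '') != 'captions':
            continue  # thumbnails and other kinds are not subtitles
        url = track.get('file', '')
        if url == '' or 'empty.srt' in url:
            continue  # placeholder entry used by the "Upload SRT" button
        label = track.get('label', 'srt')
        subs.append({
            'title': label,
            'url': url,
            'lang': label.lower()[:3],  # first three letters of the label
            'format': url[-3:],         # last three characters, e.g. 'vtt'
        })
    return subs


print(subtitle_tracks(SAMPLE_TRACKS))
```

Taking the last three characters as the format only works for three-letter extensions such as vtt or srt, which is all the sample page uses.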

Now the new server has been added to urlparser and we can watch its videos in e2iplayer!

Thanks to Maxbambi for the original text.
