Understanding Urlparser
- urlparser.py
Role
There are two types of URL that we can find on sites:
- Direct URLs: they point directly to a stream or an m3u8 playlist and are passed to the player without any further processing.
- Video server URLs (e.g. Youtube, Openload, Verystream, Videoweed, Nowvideo, ...): they open a page on the server that shows the video only after some user actions (viewing ads, answering questions, pressing play on the player).
The role of urlparser is to find the direct URLs in server pages without any user action. This is a difficult task, because the developers of video servers want users to view their pages in a browser and to reach the videos only after the ads. So direct URLs are obfuscated and sometimes encrypted. The good news is that once the method to get URLs from a server has been written, it can be used by every host that contains URLs pointing to that server.
Structure
urlparser.py is a very large file in the IPTVPlayer/libs folder that works like a huge switch statement: a big dict maps every domain it handles to a parser function.
self.hostMap = {
    'lookmovie.ag': self.pp.parserLOOKMOVIE,
    '1fichier.com': self.pp.parser1FICHIERCOM,
    '1tv.ru': self.pp.parser1TVRU,
    '37.220.36.15': self.pp.parserMOONWALKCC,
    '7cast.net': self.pp.parser7CASTNET,
    'abcast.biz': self.pp.parserABCASTBIZ,
    'abcast.net': self.pp.parserABCASTBIZ,
    'aflamyz.com': self.pp.parserAFLAMYZCOM,
    'akvideo.stream': self.pp.parserAKVIDEOSTREAM,
    ...............
    'yourvideohost.com': self.pp.parserYOURVIDEOHOST,
    'youtu.be': self.pp.parserYOUTUBE,
    'youtube.com': self.pp.parserYOUTUBE,
    'youtube-nocookie.com': self.pp.parserYOUTUBE,
    'youwatch.org': self.pp.parserYOUWATCH,
    'yukons.net': self.pp.parserYUKONS,
    'zalaa.com': self.pp.parserZALAACOM,
    'zerocast.tv': self.pp.parserZEROCASTTV,
    'zstream.to': self.pp.parserZSTREAMTO
}
The second element of each pair is the name of the function that is called to find the direct URLs for that domain.
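The lookup idea can be illustrated with a minimal, standalone sketch. It is written for Python 3 and is not the actual urlparser code: HOST_MAP, get_host_name, get_video_links and the two placeholder parsers are hypothetical names used only for illustration.

```python
from urllib.parse import urlparse


def parser_youtube(url):
    # Placeholder: a real parser would download the page and extract links.
    return [{'name': 'youtube', 'url': url}]


def parser_onlystream(url):
    # Placeholder with the same shape of result as above.
    return [{'name': 'onlystream', 'url': url}]


# Hypothetical, shortened stand-in for self.hostMap
HOST_MAP = {
    'youtu.be': parser_youtube,
    'youtube.com': parser_youtube,
    'onlystream.tv': parser_onlystream,
}


def get_host_name(url):
    """Return the lower-case domain of url without a leading 'www.'."""
    host = (urlparse(url).hostname or '').lower()
    if host.startswith('www.'):
        host = host[4:]
    return host


def get_video_links(url):
    """Look up the parser registered for the URL's domain and call it."""
    parser = HOST_MAP.get(get_host_name(url))
    if parser is None:
        return []  # unknown server: nothing can be extracted
    return parser(url)
```

Because several domains can point to the same function (as youtu.be and youtube.com do here), supporting an alias of a known server only needs one more entry in the dict.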
Adding a new server
In the previous example we found the URL https://onlystream.tv/e/6pvw6ticrzlg/, but urlparser could not recognize it, as shown in the log:
_________________getHostName: [https://onlystream.tv/e/6pvw6ticrzlg/] -> [onlystream.tv]
urlparser.getParser II try host[onlystream.tv]->host2[tv]
https://onlystream.tv/e/6pvw6ticrzlg/
So if we want to watch videos from onlystream, we have to add this server to urlparser. Let's add a line to the dict:
    'onet.pl': self.pp.parserONETTV,
    'onet.tv': self.pp.parserONETTV,
    'onlystream.tv': self.pp.parserONLYSTREAM,
    'openlive.org': self.pp.parserOPENLIVEORG,
and let's start coding its function:
def parserONLYSTREAM(self, baseUrl):
    printDBG("parserONLYSTREAM baseUrl[%s]" % baseUrl)
    # do something
    return []
Now let's write the body of the function for the server "onlystream.tv". Opening a page, we get its source code. The page contains a lot of JavaScript, which is typical for this kind of page, because JavaScript routines dynamically modify the page and open new windows to show us annoying ads. Looking for the URL, luckily we can find these lines:
jwplayer("vplayer").setup({
    sources: [{file:"https://t29old.ostreamcdn.com/u5kj6x6qcdhlsdgge7dggmqcjyxf7jc2d3vrlxtncj2jbw524ykihw7ie43q/v.mp4",label:"720p"}],
    "logo": {
        "file": "//cdn.onlystream.tv/images/player-logo.png",
        "link": "https://onlystream.tv",
        "hide": "false",
        "position": "control-bar"
    },
and these:
tracks: [{file: "https://onlystream.tv/dl?op=get_slides&length=5906.76&url=https://t29old.ostreamcdn.com/i/01/00195/dtf032mrwm3y0000.jpg", kind: "thumbnails"},
{file: "https://onlystream.tv/srt/00324/6pvw6ticrzlg_Serbian.vtt", label: "Serbian", kind: "captions","default": true},
{file: "https://onlystream.tv/srt/empty.srt", label: "Upload SRT", kind: "captions"}]
which contain the direct URL of the stream and a URL for the subtitles.
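The stream extraction can be tried outside e2iplayer with a minimal sketch in plain Python 3. It assumes the page fragment shown above; quote_keys and extract_sources are hypothetical helper names, and the keys are quoted with a regular expression here rather than with a chain of replace calls.

```python
import json
import re

# Page fragment copied from the lines shown above
PAGE = '''jwplayer("vplayer").setup({
sources: [{file:"https://t29old.ostreamcdn.com/u5kj6x6qcdhlsdgge7dggmqcjyxf7jc2d3vrlxtncj2jbw524ykihw7ie43q/v.mp4",label:"720p"}],
"logo": {
'''


def quote_keys(js_text):
    """Quote bare JavaScript object keys so that json.loads accepts the text."""
    # A bare key is a word that follows '{' or ',' and is followed by ':'.
    return re.sub(r'([{,]\s*)(\w+)\s*:', r'\1"\2":', js_text)


def extract_sources(page):
    """Return the list of source dicts found in a jwplayer setup call."""
    match = re.search(r'sources:\s*\[(.*?)\]', page, re.S)
    if not match:
        return []
    return json.loads(quote_keys('[' + match.group(1) + ']'))


sources = extract_sources(PAGE)
print(sources[0]['label'], sources[0]['file'])
```

The non-greedy group (.*?) stops at the first closing bracket, so this approach only works while the list itself contains no nested square brackets.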
Using regular expressions, we can grab the useful URLs. The final function is:
def parserONLYSTREAM(self, baseUrl):
    printDBG("parserONLYSTREAM baseUrl[%s]" % baseUrl)

    def checkTxt(txt):
        # quote the bare JavaScript keys so that the text becomes valid JSON;
        # replace() does nothing when the key is missing, so no check is needed
        txt = txt.replace('\n', ' ')
        txt = txt.replace('file:', '"file":')
        txt = txt.replace('label:', '"label":')
        txt = txt.replace('kind:', '"kind":')
        return txt

    sts, data = self.cm.getPage(baseUrl)
    if not sts:
        return []

    urlsTab = []
    subTracks = []

    # subtitles search
    t = re.findall(r"tracks: \[(.*?)\]", data, re.S)
    if t:
        txt = checkTxt("[" + t[0] + "]")
        printDBG(txt)
        tracks = json_loads(txt)
        printDBG(str(tracks))
        for tr in tracks:
            if tr.get('kind', '') == 'captions':
                printDBG(str(tr))
                srtUrl = tr.get('file', '')
                # skip the "Upload SRT" placeholder entry
                if srtUrl != '' and 'empty.srt' not in srtUrl:
                    label = tr.get('label', 'srt')
                    srtFormat = srtUrl[-3:]
                    params = {'title': label, 'url': srtUrl, 'lang': label.lower()[:3], 'format': srtFormat}
                    printDBG(str(params))
                    subTracks.append(params)

    # stream search
    s = re.findall(r"sources: \[(.*?)\]", data, re.S)
    if not s:
        return []

    txt = checkTxt("[" + s[0] + "]")
    printDBG(txt)
    links = json_loads(txt)
    #printDBG(str(links))
    for link in links:
        if 'file' in link:
            url = urlparser.decorateUrl(link['file'], {'Referer': baseUrl, 'external_sub_tracks': subTracks})
            params = {'name': link.get('label', 'link'), 'url': url}
            printDBG(str(params))
            urlsTab.append(params)

    return urlsTab
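The subtitle part of the function can also be checked on its own. The following is only a sketch in plain Python 3 under the assumption that the page contains the tracks lines shown earlier; parse_subtitles is a hypothetical name, and the standard json module and a regular expression stand in for json_loads and checkTxt.

```python
import json
import re

# Tracks fragment copied from the page lines shown earlier
TRACKS_JS = '''tracks: [{file: "https://onlystream.tv/dl?op=get_slides&length=5906.76&url=https://t29old.ostreamcdn.com/i/01/00195/dtf032mrwm3y0000.jpg", kind: "thumbnails"},
{file: "https://onlystream.tv/srt/00324/6pvw6ticrzlg_Serbian.vtt", label: "Serbian", kind: "captions","default": true},
{file: "https://onlystream.tv/srt/empty.srt", label: "Upload SRT", kind: "captions"}]'''


def parse_subtitles(page):
    """Return one dict per real caption track found in the page text."""
    match = re.search(r'tracks:\s*\[(.*?)\]', page, re.S)
    if not match:
        return []
    # Quote the bare JavaScript keys so the list becomes valid JSON.
    text = re.sub(r'([{,]\s*)(\w+)\s*:', r'\1"\2":', '[' + match.group(1) + ']')
    subs = []
    for track in json.loads(text):
        url = track.get('file', '')
        # Keep captions only and skip the "Upload SRT" placeholder.
        if track.get('kind') != 'captions' or not url or 'empty.srt' in url:
            continue
        label = track.get('label', 'srt')
        subs.append({'title': label, 'url': url,
                     'lang': label.lower()[:3], 'format': url[-3:]})
    return subs


subtitles = parse_subtitles(TRACKS_JS)
```

On this fragment only the Serbian track is kept: the thumbnails entry is not a caption and the empty.srt entry is filtered out.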
Now the new server has been added to urlparser and we can watch its videos in e2iplayer!
Thanks to Maxbambi for the original text.