Sunday, 5 February 2012

Managing MP3 tags with eyeD3

Removing all the tags from an MP3 file


I've been using eyeD3 to write tags to MP3 files from a Python script. Then I needed to remove all the tags from a file. According to the documentation, you should do something like:
tag.link("/some/file.mp3")
tag.remove()
tag.update()

but it didn't work for me, so I looked at the library code and found the following solution:
import eyeD3

tag = eyeD3.Tag()
tag.link(fileName)
# Collect the frame IDs first, then remove the frames one ID at a time
frameList = []
for frame in tag.frames:
    frameList.append(frame.header.id)
for l in frameList:
    tag.frames.removeFramesByID(l)
tag.update()
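
To illustrate what "removing all the tags" means at the byte level, here is a minimal standard-library sketch (Python 3). It is not what eyeD3 does internally and it is not a replacement for the code above; it only assumes the usual ID3 layout (an ID3v2 block at the start of the file whose size is stored as a synchsafe integer, and an optional 128-byte ID3v1 block at the end). The names synchsafe_to_int and strip_id3 are made up for this example.

```python
def synchsafe_to_int(four_bytes):
    """Decode a 4-byte ID3v2 synchsafe integer (7 useful bits per byte)."""
    b = bytearray(four_bytes)
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def strip_id3(data):
    """Return the bytes of an MP3 file without its ID3v2 and ID3v1 tags."""
    start, end = 0, len(data)

    # ID3v2: "ID3", 2 version bytes, 1 flags byte, 4 synchsafe size bytes
    if len(data) >= 10 and data[:3] == b"ID3":
        start = 10 + synchsafe_to_int(data[6:10])
        if bytearray(data)[5] & 0x10:
            # ID3v2.4 footer flag: 10 extra bytes after the tag body
            start += 10

    # ID3v1: the last 128 bytes, starting with "TAG"
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]
```

For example, a buffer made of a 15-byte ID3v2 block, some audio bytes and an ID3v1 block comes back as just the audio bytes.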


Converting an iPod podcast file into a regular audio file


The iPod uses some special tags to mark files as podcasts or audiobooks, which it stores and plays differently. I've used eyeD3 to change files I download from certain podcasts so I can use them as regular audio files.
The first step is to remove an undocumented tag called PCST; then the genre needs to be changed from Podcast to something else. Here is the code:
tag = eyeD3.Tag()
tag.link(fileName)
tag.frames.removeFramesByID("PCST")
g = eyeD3.Genre(None, 'Radio 3')
tag.setGenre(g)
tag.update()


Converting an audio file into an iPod podcast


We only have to invert the process. The content of the PCST tag is expected to be "00 00 00 04 00 00 00 00 00 00" (*). Here is the code:
import struct
import eyeD3

tag = eyeD3.Tag()
tag.link(fileName)
# Build a PCST frame by hand with the expected 10-byte content
frameHeader = eyeD3.FrameHeader()
frameHeader.id = 'PCST'
pcstValue = struct.pack('BBBBBBBBBB', 0, 0, 0, 4, 0, 0, 0, 0, 0, 0)
f = eyeD3.createFrame(frameHeader, pcstValue, eyeD3.TagHeader())
tag.frames.addFrame(f)

g = eyeD3.Genre(None, 'Podcast')
tag.setGenre(g)
tag.update()
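
As a quick check of the PCST content, here is a small standard-library sketch (Python 3) that builds the same ten bytes with struct and prints them in hex. The names build_pcst_payload and to_hex are only illustrative; they are not part of eyeD3.

```python
import struct


def build_pcst_payload():
    """Return the 10-byte PCST content: 00 00 00 04 followed by six zero bytes."""
    return struct.pack('10B', 0, 0, 0, 4, 0, 0, 0, 0, 0, 0)


def to_hex(data):
    """Format bytes as space-separated hex pairs."""
    return ' '.join('%02x' % b for b in bytearray(data))


print(to_hex(build_pcst_payload()))
```

The format string '10B' is just a shorter way of writing 'BBBBBBBBBB' (ten unsigned bytes).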

Sunday, 22 January 2012

Python's urllib2 and podomatic

I'm writing a small tool for myself to download some podcasts. To do so, I use Python and the urllib2 library. Everything went well with a number of sites until yesterday.

I discovered this great podcast and wanted to include it in the tool. The problem is that when I tried to download the MP3 file I got an "HTTP Error 403: Forbidden". This puzzled me because the web browser can access the file with no problem.

I started Wireshark to look into the requests, but at first I couldn't see any significant difference. After a few tries I discovered that the issue was the User-Agent field of the header. The library was sending something like:

User-agent: Python-urllib/2.6\r\n

So I decided to change it. This is the code that does the trick:
 
import urllib2

# Pretend to be a browser instead of sending the default Python-urllib agent
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-agent': user_agent}
try:
    req = urllib2.Request(url, headers=headers)
    response = urllib2.urlopen(req)
except urllib2.URLError as e:
    print("Error: %(e)s with url: %(u)s" % {'e': e, 'u': url})

Now the question is: why do they configure the server like that?
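
For reference, the same trick can be sketched in Python 3, where urllib2 was split into urllib.request and urllib.error. This is only a sketch under that assumption; build_request and fetch are illustrative names, and I haven't tested it against the site in question.

```python
import urllib.error
import urllib.request

DEFAULT_USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Create a request that carries a browser-like User-agent header."""
    return urllib.request.Request(url, headers={'User-agent': user_agent})


def fetch(url, user_agent=DEFAULT_USER_AGENT):
    """Download url and return its bytes, or None if the request fails."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent)) as response:
            return response.read()
    except urllib.error.URLError as e:
        print("Error: %s with url: %s" % (e, url))
        return None
```

Keeping build_request separate makes it possible to inspect the header with req.get_header('User-agent') without touching the network.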

Saturday, 21 January 2012

Downloading a Flash audio stream and converting it to MP3

I've created this Python script to download a Flash audio stream and then convert it to MP3. My idea was to download a radio program so I can listen to it later on a portable device with no network connection. This script can't be used for live streams.

The script uses three programs: rtmpdump, ffmpeg and lame. rtmpdump is a utility to download a stream using the RTMP protocol. It can be compiled easily; it depends on libssl and zlib. ffmpeg and lame should be available for any Linux distribution; I compiled them for Mac.

The first step is to download the contents with rtmpdump. It's not a good idea to download everything in one go, since it might take a lot of time. I've implemented a multithreaded mechanism instead. The number of threads and the length in seconds of each chunk can be configured. Each downloaded audio chunk is stored in its own file.
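
The way the stream is cut into chunks can be sketched on its own. This is a minimal version of the same logic the script uses, with an illustrative function name (chunk_ranges); an end of 0.0 means "until the end of the stream", which is the convention the script follows for the last chunk.

```python
def chunk_ranges(duration, chunk_size):
    """Split [0, duration) into (begin, end) pairs of chunk_size seconds.

    The last pair has end == 0.0, meaning "download until the end".
    """
    ranges = []
    begin = 0.0
    end = float(chunk_size)
    while begin < duration:
        if end >= duration:
            end = 0.0
        ranges.append((begin, end))
        begin += chunk_size
        end += chunk_size
    return ranges


print(chunk_ranges(150.0, 60))
```

With a 150-second stream and 60-second chunks this gives three ranges: 0 to 60, 60 to 120, and 120 to the end.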

The second step is to convert all the chunks to raw PCM format and concatenate them into a single file. The ffmpeg program is used to do so.

Finally, the raw file is converted to mp3 using lame.

The script uses a considerable amount of disk space for the temporary files. I think the script could be used, with a little modification, to download video streams, but I haven't tried that.
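
To get a feel for the disk usage, the size of the raw file can be estimated from the audio settings the script passes to ffmpeg (44100 Hz, 2 channels, 16-bit samples). This is a rough sketch with an illustrative function name (raw_pcm_bytes); it does not count the downloaded FLV chunks, whose size depends on the stream.

```python
def raw_pcm_bytes(seconds, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Size in bytes of uncompressed PCM audio of the given length."""
    return int(seconds * sample_rate * channels * bytes_per_sample)


one_hour = raw_pcm_bytes(3600)
print("%d bytes (%.1f MiB)" % (one_hour, one_hour / (1024.0 * 1024.0)))
```

That is 176400 bytes per second, so a one-hour program needs 635040000 bytes (about 606 MiB) for the raw file alone.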

Here is the code:


#!/usr/bin/env python

import sys
import subprocess
import os
import threading

def executeCommand (cmd):
    # Run cmd, wait for it to finish and return its (stdout, stderr) output
    p1 = subprocess.Popen (cmd, stdout = subprocess.PIPE,
                           stderr = subprocess.PIPE)
    out = p1.communicate()
    return out

class ChunkDownloaderThread (threading.Thread):
    def __init__ (self, downloader):
        threading.Thread.__init__(self)
        self._downloader = downloader

    def run (self):
        # Keep asking the downloader for commands until none are left
        exitLoop = False
        while not exitLoop:
            cmd = self._downloader.getNextThreadCmd ()
            if cmd is None:
                exitLoop = True
            else:
                executeCommand (cmd[0])
                sys.stdout.write ( str(cmd[1]) + ' ')
                sys.stdout.flush ()

class Downloader (object):
    _rtmpDumpProgram = 'rtmpdump'
    _ffmpegProgram = 'ffmpeg'
    _lameProgram= 'lame'
    _tmpFile = 'tempFile'
    _tmpExtension = '.flv'
    _rawExtension = '.raw'

    def __init__ (self, chunkSize, noOfThreads):
        """ The class constructor.
        chunkSize: the size in seconds of each file chunk.
        noOfThreads: the number of simultaneous threads to use. """
        self._chunkSize = chunkSize
        self._noOfThreads = noOfThreads



    def cleanTempFiles (self):
        # Remove the single temporary files (.flv and .raw)
        extensions = [self._tmpExtension, self._rawExtension ]
        for e in extensions:
            fileName = self._tmpFile + e
            if os.path.isfile (fileName):
                os.remove (fileName)

        # Remove the numbered chunk files, stopping at the first missing index
        index = 0
        exitLoop = False
        while not exitLoop:
            fileName = self._tmpFile + str(index) + self._tmpExtension
            if os.path.isfile (fileName):
                os.remove (fileName)
            else:
                exitLoop = True
            index += 1



    def prepareDownloadCmd (self, url, destinationFile):
        cmd = [self._rtmpDumpProgram]
        cmd += ['-r', url, '-o', destinationFile]
        return cmd

    def findDuration (self, url):
        duration = 0.0
        fileName = self._tmpFile + self._tmpExtension
        cmd = self.prepareDownloadCmd (url, fileName)

        # Download just one second
        cmd += ['-B', '1']
        # The stream information is read from the command's stderr
        output = executeCommand(cmd)[1]
        output = output.split ('\n')
        for line in output:
            durationStr = 'duration'
            infoStr = 'INFO:'
            pos = line.find (infoStr)
            if pos > -1:
                pos = line.find (durationStr)
                if pos > -1:
                    duration = line[pos + len(durationStr):].strip()
                    duration = float(duration)
                    break
        self.cleanTempFiles()
        return duration

    def downloadChunk (self, url, tempFile, firstSecond=0.0, lastSecond=0.0):
        # A value of 0.0 means "no limit": start at the beginning / go to the end
        cmd = self.prepareDownloadCmd (url, tempFile)
        if firstSecond != 0.0:
            cmd += ['-A', str(firstSecond)]
        if lastSecond != 0.0:
            cmd += ['-B', str(lastSecond)]
        return cmd

    def getNextThreadCmd (self):
        # Hand out the next chunk command under the lock; None means no work is left
        retVal = None
        self.theLock.acquire (True)
        if self.currentChunk < self.totalChunks:
            auxIndex = self.currentChunk
            cmd = self.downloadChunk (self.url, self.chunkList[auxIndex][2],
                                      self.chunkList[auxIndex][0],
                                      self.chunkList[auxIndex][1])
            retVal = [cmd, auxIndex]
            self.currentChunk += 1

        self.theLock.release ()

        return retVal

    def downloadFile (self, url):
        duration = self.findDuration (url)

        self.chunkList = []
        chSize = float (self._chunkSize)
        begin = 0.0
        end = chSize
        index = 0
        while (begin < duration):
            if end >= duration:
                end = 0.0
            fileName = self._tmpFile + str(index) + self._tmpExtension
            self.chunkList.append ([begin, end, fileName])
            begin += chSize
            end += chSize
            index += 1

        self.totalChunks = index
        self.currentChunk = 0
        self.url = url
        self.theLock = threading.Lock()
        index = 0

        sys.stdout.write ('Total ' + str(self.totalChunks) + '\n')
        sys.stdout.flush ()

        threads = []
        for i in range (0, self._noOfThreads):
            t = ChunkDownloaderThread (self)
            threads.append (t)

        for t in threads:
            t.start()
        for t in threads:
            t.join()

        sys.stdout.write ('\n')
        sys.stdout.flush ()
        return self.totalChunks

    def concatenateFile (self, totalChunks):
        # Flv to raw command
        cmd = [self._ffmpegProgram, '-i']
        completeCmd = ['-vn', '-f', 'u16le', '-acodec', 'pcm_s16le',
                       '-ac', '2', '-ab', '128k', '-ar', '44100', '-']#, '<', '/dev/null']
        tempRawFile = self._tmpFile + self._rawExtension
        f = open (tempRawFile, "wb")
        sys.stdout.write ('Concatenate: \n')
        sys.stdout.flush ()
        for i in range (0, totalChunks):
            flvFile = self._tmpFile + str(i) + self._tmpExtension
            toExe = cmd + [flvFile] + completeCmd

            output = executeCommand (toExe)
            f.write (output[0])
            sys.stdout.write (str (i) + ' ')
            sys.stdout.flush ()
            # Delete the chunk to save disk space
            os.remove (flvFile)
        sys.stdout.write ('\n')
        sys.stdout.flush ()

        f.close ()


    def convertToMp3 (self, destination):
        tempRawFile = self._tmpFile + self._rawExtension
        cmd = [self._lameProgram, '-r', '-s', '44.1', '--preset', 'cd',
               tempRawFile, destination]
        sys.stdout.write ('Converting to mp3\n')
        sys.stdout.flush ()
        executeCommand (cmd)
        sys.stdout.write ('Done\n')
        sys.stdout.flush ()

    def downloadAndConvertFile (self, url, destination):
        totalChunks = self.downloadFile (url)
        self.concatenateFile (totalChunks)
        self.convertToMp3 (destination)
        self.cleanTempFiles ()


if __name__ == '__main__':
    # 60-second chunks, 15 download threads
    D = Downloader(60, 15)

    # Arguments: the stream URL and the destination mp3 file
    D.downloadAndConvertFile (sys.argv[1], sys.argv[2])
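
The way the threads share the work can also be tried in isolation, without rtmpdump. Here is a minimal sketch of the same idea (a counter protected by a lock that hands out chunk indexes until none are left). The names ChunkCounter and run_workers are made up for this example, and the work function is a stand-in for the real download command.

```python
import threading


class ChunkCounter(object):
    """Hands out chunk indexes 0..total-1, one at a time, in a thread-safe way."""

    def __init__(self, total):
        self._lock = threading.Lock()
        self._next = 0
        self._total = total

    def next_index(self):
        """Return the next index, or None when every chunk has been handed out."""
        with self._lock:
            if self._next >= self._total:
                return None
            index = self._next
            self._next += 1
            return index


def run_workers(total_chunks, num_threads, work):
    """Call work(index) once per chunk using num_threads threads."""
    counter = ChunkCounter(total_chunks)
    done = []
    done_lock = threading.Lock()

    def worker():
        while True:
            index = counter.next_index()
            if index is None:
                return
            work(index)  # stand-in for running the download command
            with done_lock:
                done.append(index)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(done)
```

Because each thread asks for a new index only when it has finished the previous one, slow chunks don't hold up the others, and no chunk is downloaded twice.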