I'm trying to write a Python program that turns an MP3 plus a lyrics file into an MP4.
I have a lyrics file like this:
[00:00.60]Revelation, chapter 4
[00:02.34]After these things I looked,
[00:04.10]and behold a door was opened in heaven,
[00:06.41]and the first voice which I heard, as it were,
[00:08.78]of a trumpet speaking with me, said:
[00:11.09]Come up hither,
[00:12.16]and I will shew thee the things which must be done hereafter.
[00:15.78]And immediately I was in the spirit:
[00:18.03]and behold there was a throne set in heaven,
[00:20.72]and upon the throne one sitting.
[00:22.85]And he that sat,
[00:23.91]was to the sight like the jasper and the sardine stone;
[00:26.97]and there was a rainbow round about the throne,
[00:29.16]in sight like unto an emerald.
[00:31.35]And round about the throne were four and twenty seats;
[00:34.85]and upon the seats, four and twenty ancients sitting,
[00:38.03]clothed in white garments, and on their heads were crowns of gold.
[00:41.97]And from the throne proceeded lightnings, and voices, and thunders;
[00:46.03]and there were seven lamps burning before the throne,
[00:48.60]which are the seven spirits of God.
[00:51.23]And in the sight of the throne was, as it were,
[00:53.79]a sea of glass like to crystal;
[00:56.16]and in the midst of the throne, and round about the throne,
[00:59.29]were four living creatures, full of eyes before and behind.
[01:03.79]And the first living creature was like a lion:
I'm trying to create a sequence of images from the lyrics to feed into ffmpeg:
os.system(ffmpeg_path + " -r 2 -i " + images_path + "image%1d.png -i " + audio_file + " -vcodec mpeg4 -y " + video_name)
I tried to work out how many images to make for each line by subtracting the current line's timestamp from the next line's timestamp. It works, but the results are very inconsistent.
import os
import datetime
import time
import math
from PIL import Image, ImageDraw

ffmpeg_path = os.getcwd() + "\\ffmpeg\\bin\\ffmpeg.exe"
images_path = os.getcwd() + "\\test_output\\"
audio_file = os.getcwd() + "\\audio.mp3"
lyric_file = os.getcwd() + "\\lyric.lrc"
video_name = "movie.mp4"

def save():
    lyric_to_images()
    os.system(ffmpeg_path + " -r 2 -i " + images_path + "image%1d.png -i " + audio_file + " -vcodec mpeg4 -y " + video_name)

def lyric_to_images():
    file = open(lyric_file, "r")
    data = file.readlines()
    startOfLyric = True
    lstTimestamp = []
    images_to_make = 0
    from_second = 0.0
    to_second = 0.0
    for line in data:
        vTime = line[1:9] # 00:00.60
        temp = vTime.split(':')
        minute = float(temp[0])
        #a = float(temp[1].split('.'))
        #second = float((minute * 60) + int(a[0]))
        second = (minute * 60) + float(temp[1])
        lstTimestamp.append(second)
    counter = 1
    for i, second in enumerate(lstTimestamp):
        if startOfLyric is True:
            startOfLyric = False
            #first line is always 3 seconds (images to make = 3x2)
            for x in range(1, 7):
                writeImage(data[i][10:], 'image' + str(counter))
                counter += 1
        else:
            from_second = lstTimestamp[i-1]
            to_second = second
            difference = to_second - from_second
            images_to_make = int(difference * 2)
            for x in range(1, int(images_to_make+1)):
                writeImage(data[i-1][10:], 'image'+str(counter))
                counter += 1
    file.close()

def writeImage(v_text, filename):
    img = Image.new('RGB', (480, 320), color = (73, 109, 137))
    d = ImageDraw.Draw(img)
    d.text((10,10), v_text, fill=(255,255,0))
    img.save(os.getcwd() + "\\test_output\\" + filename + ".png")

save()
Is there any efficient and accurate way to calculate how many images I need to create for each line?
Note: The number of images for each line has to be its duration in seconds multiplied by 2, because I'm using -r 2 for FFmpeg (2 FPS).
Use subtitles with the subtitles filter. This will be easier and more efficient than making images beforehand and trying to time everything. You can also control the font, size, color, style, position, etc. Example using the color filter as the background:
ffmpeg -i music.mp3 -filter_complex "color=c=blue,subtitles=lyrics.srt[v]" -map "[v]" -map 0:a -c:a aac -shortest output.mp4
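Since the question launches ffmpeg from Python, the same command can be started with subprocess instead of os.system, passing the arguments as a list so that no shell quoting is involved. This is only a sketch: the helper names build_ffmpeg_command and render are made up for illustration, ffmpeg is assumed to be on PATH and built with libass (which the subtitles filter needs), and the -y flag is carried over from the question's command.

```python
import subprocess


def build_ffmpeg_command(audio, subtitles, output, ffmpeg="ffmpeg"):
    """Return the argument list for the command shown above (hypothetical helper)."""
    # Solid-colour background with the subtitles burned in, labelled [v].
    filtergraph = f"color=c=blue,subtitles={subtitles}[v]"
    return [
        ffmpeg, "-y",
        "-i", audio,
        "-filter_complex", filtergraph,
        "-map", "[v]", "-map", "0:a",
        "-c:a", "aac",
        "-shortest",  # the color source never ends, so stop with the audio
        output,
    ]


def render(audio, subtitles, output):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_command(audio, subtitles, output), check=True)
```

Calling render("music.mp3", "lyrics.srt", "output.mp4") would then produce the video. The sketch assumes a plain relative subtitle file name; absolute Windows paths contain colons and backslashes, which are special characters in a filtergraph and would need escaping.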
SRT is a simple format that supports basic styling:
1
00:00:00,600 --> 00:00:02,340
Revelation, chapter 4

2
00:00:02,340 --> 00:00:04,100
<b>After</b> these <u>things</u> I <font color="green">looked</font>,

3
00:00:04,100 --> 00:00:06,410
and behold a door was opened in heaven,
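The lyrics file in the question is in LRC format, so it has to be converted before the subtitles filter can read it as SRT. A minimal sketch of such a conversion follows. The function names are illustrative, each cue is assumed to end where the next one starts, and the length of the final cue (last_duration) is an arbitrary assumption because LRC stores start times only.

```python
import re

# Matches lines such as "[00:02.34]After these things I looked,"
LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")


def parse_lrc(text):
    """Return a list of (start_seconds, lyric) tuples from LRC text."""
    entries = []
    for line in text.splitlines():
        match = LRC_LINE.match(line.strip())
        if match:
            minutes, seconds, lyric = match.groups()
            entries.append((int(minutes) * 60 + float(seconds), lyric.strip()))
    return entries


def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def lrc_to_srt(text, last_duration=3.0):
    """Convert LRC text to SRT; each cue ends where the next one starts."""
    entries = parse_lrc(text)
    blocks = []
    for index, (start, lyric) in enumerate(entries):
        if index + 1 < len(entries):
            end = entries[index + 1][0]
        else:
            end = start + last_duration  # assumed length of the final cue
        blocks.append(f"{index + 1}\n{srt_time(start)} --> {srt_time(end)}\n{lyric}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

Writing the result of lrc_to_srt(open("lyric.lrc").read()) to lyrics.srt gives a file that the command above can use. Because the timestamps are copied rather than rounded to a frame grid, the timing drift described in the question does not arise.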
With ASS subtitles you can get even more control, such as individual word and letter styling, but this format is much more complicated:
[Script Info]
ScriptType: v4.00+
PlayResX: 384
PlayResY: 288
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,0,100,100,0,0,1,1,0,2,10,10,10,0
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.60,0:00:02.34,Default,,0,0,0,,Revelation, chapter 4
Dialogue: 0,0:00:02.34,0:00:04.10,Default,,0,0,0,,After these things I looked,
Dialogue: 0,0:00:04.10,0:00:06.41,Default,,0,0,0,,and behold a door was opened in heaven,
This example only shows the format structure: I didn't add any styling. Aegisub can be used to create ASS subtitles if you want to experiment with this format. ffmpeg can convert subtitle formats.
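The conversion is an ordinary ffmpeg invocation in which the file extensions select the subtitle formats, for example lyrics.srt to lyrics.ass. A small sketch of calling it from Python, assuming ffmpeg is on PATH; both helper names are hypothetical.

```python
import subprocess


def subtitle_convert_command(source, target, ffmpeg="ffmpeg"):
    """Build the command that converts one subtitle file into another format."""
    # ffmpeg infers both formats from the extensions, e.g. .srt -> .ass
    return [ffmpeg, "-y", "-i", source, target]


def convert_subtitles(source, target):
    """Run the conversion and raise CalledProcessError on failure."""
    subprocess.run(subtitle_convert_command(source, target), check=True)
```

For instance, convert_subtitles("lyrics.srt", "lyrics.ass") would give a starting ASS file that can then be styled by hand or in Aegisub.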
force_style option
The force_style option in the subtitles filter can extend the formatting possibilities of the simplistic SRT format. It uses the ASS format options such as Fontsize, Fontname, OutlineColour, etc. Look at the Format line under [V4+ Styles] in the ASS example above for a list of options.
subtitles=lyrics.srt:force_style='Fontname=DejaVu Serif,PrimaryColour=&HCCFF0000'
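When the command is assembled in Python, the filter string can be built from keyword arguments. The helper below is a hypothetical sketch, not part of ffmpeg; it only joins the style fields and does not check that the names are valid ASS style options.

```python
def subtitles_filter(path, **style):
    """Build a subtitles filter string with an optional force_style clause."""
    if not style:
        return f"subtitles={path}"
    # ASS style fields are joined with commas; the single quotes keep those
    # commas from being read as filter separators by ffmpeg.
    overrides = ",".join(f"{key}={value}" for key, value in style.items())
    return f"subtitles={path}:force_style='{overrides}'"
```

Calling subtitles_filter("lyrics.srt", Fontname="DejaVu Serif", PrimaryColour="&HCCFF0000") reproduces the string above, which can take the place of subtitles=lyrics.srt in the filtergraph of the ffmpeg command earlier in this answer.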