Extracting Substrings and Replacing Strings in Python

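My teacher recently gave me a small task: for each record in a txt file, download the song's lyrics and save them to a file named after the song ID. The file format is as follows: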

2566548|10003|Down And Out In Birmingham|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|90984755,3013166,46207696,3899540,||3899540|欧美,英语|1
2566446|10003|I Take My Comfort In You|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|50346556,3013160,46176406,3897682,||3897682|欧美,英语|1
2566376|10003|Rollin Home (Pirates)|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|89640307,3133916,89640499,3734015,||3734015|欧美,英语|1
2566371|10003|Speak Of The Devil|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|49740044,3013161,46175281,3896707,|http://music.baidu.com/data2/lrc/12489233/12489233.lrc|3896707|英语,欧美,关键音,电声乐器,略使用人声合唱|1
2566334|10003|Talkin Bout Love|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|89637378,3013164,45176292,3898831,||3898831|欧美,英语|1
2566321|10003|Redneck Rock N Roll|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|90828205,3013169,46175679,3897632,||3897632|欧美,英语|1
2566291|10003|Anything Goes|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|90959150,3133919,90959245,3735356,|http://music.baidu.com/data2/lrc/12489216/12489216.lrc|3735356|欧美,英语,北美流行|1
2566278|10003|Honky Tonk Blues|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|90996181,3133968,46175067,3734602,||3734602|欧美,英语|1
2566264|10003|Feed Jake|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|90961516,3736359,46505620,12953128,||12953128|吉他,英语,欧美,原音,关键音|1
2566175|10003|Jolly Roger/Pirates Of The Misissippi|Pirates Of The Mississippi|2502001|Pirates Of The Mississippi|http://musicdata.baidu.com/data2/pic/115488944/115488944.jpg|90891846,3133971,90892078,3734253,||3734253|欧美,英语|1
2202028/70062199ac085441291f#6af6ba5f844fb7b92ea830c1e261d6fa|10003|Dream You (Dance Mix)|Pirates Of The Mississippi|2028188|The Best Of Pirates Of The Mis|http://c.hiphotos.baidu.com/ting/pic/item/6609c93d70cf3bc785905cf9d300baa1cd112a16.jpg|67575556,||67575556||1
2601446|10002|Alright|Pilot Speed|2505781|Into The West|http://musicdata.baidu.com/data2/pic/115530058/115530058.jpg|3146724,45388161,3717336,|http://music.baidu.com/data2/lrc/14892491/14892491.lrc|3717336|alternative pop,独立流行,search,独立摇滚,伤感|60
121415441|10002|Alright|Pilot Speed|121414726|Into The West|http://b.hiphotos.baidu.com/ting/pic/item/10dfa9ec8a1363273fb23bf7938fa0ec09fac793.jpg|122645521,122645530,122645337,||122645337||30

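At first glance the data format looks regular enough to process directly. On closer inspection, though, the "|" right after the leading ID has turned into a "/" in some records. So the job is to turn the slash that sometimes follows the leading ID into a vertical bar, then extract the song ID, and finally extract the song's lyrics URL.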
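The three steps above can be sketched on individual records. This is only a minimal illustration, not the script used in this post: the helper name `parse_record` is made up, and it assumes the stray "/" only ever appears directly after the leading ID, as in the sample data. The two test records are copied from the sample above.

```python
import re

# Dots are escaped so they match literally; the lazy '.+?' stops at the first '.lrc'.
LRC_URL = re.compile(r"http://music\.baidu\.com/data2/lrc/.+?\.lrc")

def parse_record(line):
    """Return (song_id, lrc_url or None) for one record (hypothetical helper)."""
    first_bar = line.find("|")
    first_slash = line.find("/")
    # Replace a '/' only when it comes before the first '|',
    # so slashes inside titles and URLs are left alone.
    if first_slash != -1 and (first_bar == -1 or first_slash < first_bar):
        line = line.replace("/", "|", 1)
    song_id = line.split("|", 1)[0]
    match = LRC_URL.search(line)
    return song_id, (match.group() if match else None)

broken = "2202028/70062199ac085441291f#6af6ba5f844fb7b92ea830c1e261d6fa|10003|Dream You (Dance Mix)|Pirates Of The Mississippi|2028188|The Best Of Pirates Of The Mis|http://c.hiphotos.baidu.com/ting/pic/item/6609c93d70cf3bc785905cf9d300baa1cd112a16.jpg|67575556,||67575556||1"
normal = "2601446|10002|Alright|Pilot Speed|2505781|Into The West|http://musicdata.baidu.com/data2/pic/115530058/115530058.jpg|3146724,45388161,3717336,|http://music.baidu.com/data2/lrc/14892491/14892491.lrc|3717336|alternative pop,独立流行,search,独立摇滚,伤感|60"

print(parse_record(broken))  # ('2202028', None)
print(parse_record(normal))  # ('2601446', 'http://music.baidu.com/data2/lrc/14892491/14892491.lrc')
```

Passing `1` as the third argument to `str.replace` limits the replacement to the first occurrence, which is what keeps the URLs later in the line intact.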

After some tinkering, this is the result so far:

[Image: song_lrc]

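The logic really is as simple as described above. Here is the code; I am still a beginner, so it only implements the basic functionality, with no optimization.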

    # Python 3
    import re

    # Pattern for the lyrics (LRC) URL; the dots are escaped so they match literally.
    LRC_URL = re.compile(r'http://music\.baidu\.com/data2/lrc/.+?\.lrc', re.I)

    # Assumption: E:\song.txt is UTF-8 encoded; adjust the encoding if it is not.
    with open("E:\\song.txt", encoding="utf-8") as f:
        for line in f:
            # The song ID is everything before the first '|' (or the stray '/').
            song_id = re.split(r'[|/]', line, maxsplit=1)[0]
            print("Song ID:", song_id)
            match_obj = LRC_URL.search(line)
            if match_obj:
                print("Lyrics URL:", match_obj.group())
            else:
                print("LRC is null!!")
            print("<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<")
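A regex detail that matters for this kind of data: in Python's `re` module, `|` and `.` are metacharacters. An unescaped `|` means alternation and an unescaped `.` matches any character, so both need a backslash (or `re.escape`) to be matched literally. The short sketch below illustrates the difference; `record` is shortened from the first sample record, and the string `musicXbaidu` is made up for the demonstration.

```python
import re

record = "2566548|10003|Down And Out In Birmingham"

# Unescaped: '(.*)|' reads as "anything OR nothing", so the greedy '.*' takes the whole line.
loose = re.match(r"(.*)|", record)
# Escaped and non-greedy: capture only up to the first literal '|'.
strict = re.match(r"(.*?)\|", record)

print(loose.group(1))   # 2566548|10003|Down And Out In Birmingham
print(strict.group(1))  # 2566548

# An unescaped dot matches any character; an escaped dot matches only '.'.
print(bool(re.search(r"music.baidu", "musicXbaidu")))   # True
print(bool(re.search(r"music\.baidu", "musicXbaidu")))  # False

# re.escape builds the escaped form for a literal string.
print(re.escape("music.baidu.com"))  # music\.baidu\.com
```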

The output of the script is in the format of the result shown earlier.
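The script only prints the ID and the lyrics URL; the last step of the task, saving each lyrics file under its song ID, is not shown. One possible sketch follows, assuming Python 3 and the standard `urllib.request` module. The names `lrc_path` and `download_lrc` and the `lyrics` output directory are hypothetical, and whether the Baidu URLs are still reachable is not something this post can confirm.

```python
import os
import urllib.request

def lrc_path(song_id, out_dir="lyrics"):
    """Build the target path: a file named <song_id>.lrc inside out_dir (hypothetical layout)."""
    return os.path.join(out_dir, song_id + ".lrc")

def download_lrc(song_id, url, out_dir="lyrics", fetch=urllib.request.urlretrieve):
    """Download one lyrics file and save it under the song ID.

    `fetch` takes (url, filename); it is a parameter so the function
    can be tried out without network access.
    """
    os.makedirs(out_dir, exist_ok=True)
    target = lrc_path(song_id, out_dir)
    fetch(url, target)
    return target
```

Records without a lyrics URL would simply be skipped before calling `download_lrc`.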

Author: 莫问
