I have a very large text file that looks like this:
file.txt
rs121334 6546
rs42323 4214
rs254532 5223
. .
. .
rs42323 5223
The first column is an rs number and the second column is a gene number. I want to write code that searches file.txt for specific gene numbers and writes the lines where those gene numbers were found to another file, set.txt. I have the following code, but it is not working:
dta=open("file.txt","r")
w=open("set.txt","w")
searchgenes=('5223','2645')
for line in dta.readlines():
    line=line.split()
    for word in searchgenes:
        if word in line[1]:
            w.write(line)
When I run the code, I get a TypeError: expected a character buffer object.
Any help/suggestions will be appreciated. Thanks!
The following code should work:
dta=open("file.txt","r")
w=open("set.txt","w")
searchgenes=('5223','2645')
for line in dta.readlines():
    rs_number, gene_number=line.split()
    print(gene_number)
    for word in searchgenes:
        if word in gene_number:
            w.write(line)
dta.close()
w.close()
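The error occurs because line.split() returns a list, and the original code then passed that list to w.write(), which expects a string. It is better to avoid re-using the line variable, and the input and output files should be closed.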
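As a possible variation on the same idea, here is a minimal sketch that uses with blocks so both files are closed automatically, and compares the gene number for an exact match rather than with a substring test (with "if word in gene_number", a search for 5223 would also match a gene number such as 15223). The function name filter_genes and its parameters are made up for this illustration, and the sketch assumes whitespace-separated columns as in the sample above.

```python
def filter_genes(in_path, out_path, search_genes):
    """Copy lines whose second column exactly equals one of search_genes.

    Hypothetical helper for illustration; returns the number of lines written.
    """
    wanted = set(search_genes)  # set membership is fast even for many genes
    matched = 0
    # The with statement closes both files, even if an error occurs.
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        # Iterating over the file reads one line at a time,
        # which avoids loading a very large file into memory.
        for line in src:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            if fields[1] in wanted:
                dst.write(line)  # write the original string, not the list
                matched += 1
    return matched
```

It could be called as filter_genes("file.txt", "set.txt", ("5223", "2645")). If substring matching is actually what you want, keep the inner loop from the code above instead of the set lookup.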