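Could someone help me with this problem? I have a set of questions in a text file, and I want to extract the content between several regular-expression patterns and write it to a text or CSV file.
I want to write each question's content (test data given at the bottom) to a new text/CSV file with ',' as the separator, where the first line of the output looks like this:
Which of the following..., Fast processor..., They must be dual..., Similar RAM..., Fast network..., B, Dual-homed...
The next line of the output text/CSV file should contain: Which of the following..., Micro, Worm, Trojan, Virus, D, Computer viruses infect...
Note: ignore the "..." above; it only stands for the remaining content of each field. The information from question 2 goes on its own line.
I want to achieve this with regular expressions and a loop, using the question number, the multiple-choice options, the answer, and the explanation fields as the main entities for extracting the data.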
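Test data:
QUESTION NO: 1
Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly function?
A. Fast processor to help with network traffic analysis
B. They must be dual-homed
C. Similar RAM requirements
D. Fast network interface cards
Answer: B Explanation:
Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy purposes, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing preventive security.
QUESTION NO: 2
Which of the following is an application that requires a host application for replication?
A. Micro
B. Worm
C. Trojan
D.
Virus
Answer: D Explanation:
Computer viruses infect a variety of different subsystems on their hosts. A computer virus is a malware that, when executed, replicates by reproducing itself or infecting other programs by modifying them. Infecting computer programs can include as well, data files, or the boot sector of the hard drive. When this replication succeeds, the affected areas are then said to be "infected".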
import re
inFile = open("input_new.txt",encoding='utf-8')
outFile = open("result.txt", "w")
buffer1 = ""
keepCurrentSet = True
for line in inFile:
    buffer1 = buffer1 + line
buffer1=re.findall(r"(?<=QUESTION NO:\s\d\s*) (.*?) (?=A\.)", buffer1)
outFile.write("".join(buffer1))
inFile.close()
outFile.close()
Since your question-and-answer text file is structured, you can convert the data for a single question into a CSV row with the following approach:
import re
text = """Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly function?
A. Fast processor to help with network traffic analysis
B. They must be dual-homed
C. Similar RAM requirements
D. Fast network interface cards
Answer: B Explanation:
Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy purposes, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing preventive security."""
text1 = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', text)
text1 = re.sub(r'\n+Answer:|Explanation:\n+', ',', text1)
print(text1)
Final output:
Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly function?, Fast processor to help with network traffic analysis, They must be dual-homed, Similar RAM requirements, Fast network interface cards, B ,Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy purposes, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing preventive security.
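As you can see, the text variable holds the data for a single question and its answer, including the newline characters (you get these newlines when you read the data from a file in Python).
Next, what I do is: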
text1 = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', text)
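Here \n+A\. matches, for example, the label "A." in the multiple-choice options (the \n+ part matches all the newlines that precede the option). With the code above, we replace each MCQ option label with ','.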
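As a side note, the four alternatives can be collapsed into a single character class. The sketch below assumes the options are always labeled A through D at the start of a line; the `sample` string and the `OPTION_LABEL` name are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical sample; assumes option labels are always A-D at the start of a line.
sample = "Question text?\nA. first\nB. second\n\nC. third\nD. fourth"

# One character class instead of four separate alternatives.
OPTION_LABEL = re.compile(r'\n+[A-D]\.')

# The pattern from the answer, for comparison.
original = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', sample)
compact = OPTION_LABEL.sub(',', sample)

print(compact)
```

Both patterns should give the same result on input of this shape; the character class is simply shorter to read and easier to extend if a question has an option E.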
In the last step, I do this:
text1 = re.sub(r'\n+Answer:|Explanation:\n+', ',', text1)
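This takes care of the actual answer and its explanation.
You can loop the logic above over each question in the file to produce the output you want; I leave that part to you.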
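One possible way to write that loop is sketched below. It assumes every question starts with a "QUESTION NO: <number>" header, as in the test data; the helper names `question_to_row` and `convert` and the shortened `demo` text are hypothetical.

```python
import re

def question_to_row(block):
    """Apply the two substitutions from the answer to one question block."""
    row = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', block.strip())
    row = re.sub(r'\n+Answer:|Explanation:\n+', ',', row)
    return row

def convert(raw_text):
    """Split the text on 'QUESTION NO: <n>' headers and convert each block."""
    blocks = re.split(r'QUESTION NO:\s*\d+', raw_text)
    # The split yields an empty first chunk before the first header; skip it.
    return [question_to_row(b) for b in blocks if b.strip()]

# Shortened, made-up data in the same layout as the test data.
demo = """QUESTION NO: 1
First question?
A. one
B. two
C. three
D. four
Answer: B Explanation:
Because two.
QUESTION NO: 2
Second question?
A. red
B. green
C. blue
D. black
Answer: D Explanation:
Because black."""

rows = convert(demo)
for row in rows:
    print(row)
```

To run this on the real data, read the whole file into a string first (for example the contents of input_new.txt from the question), pass it to `convert`, and write the returned rows to the output file one per line.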
Edit:
You can also use the following code to process multiple questions at once:
import re
text = """QUESTION NO: 1
Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly function?
A. Fast processor to help with network traffic analysis
B. They must be dual-homed
C. Similar RAM requirements
D. Fast network interface cards
Answer: B Explanation:
Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy purposes, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing preventive security.
QUESTION NO: 2
Which of the following is an application that requires a host application for replication?
A. Micro
B. Worm
C. Trojan
D.
Virus
Answer: D Explanation:
Computer viruses infect a variety of different subsystems on their hosts. A computer virus is a malware that, when executed, replicates by reproducing itself or infecting other programs by modifying them. Infecting computer programs can include as well, data files, or the boot sector of the hard drive. When this replication succeeds, the affected areas are then said to be "infected"."""
text1 = re.sub(r'QUESTION NO: \d+', '\n', text)
text1 = re.sub(r'(\n+A\.)|(\n+B\.)|(\n+C\.)|(\n+D\.)', ',', text1)
text1 = re.sub(r'\n+Answer:|Explanation:\n+', ',', text1)
text1 = re.sub(r'\n+', '\n', text1)
print(text1)
This prints the following:
Which of the following is a hardware requirement that either an IDS/IPS system or a proxy server must have in order to properly function?, Fast processor to help with network traffic analysis, They must be dual-homed, Similar RAM requirements, Fast network interface cards, B ,Dual-homed or dual-homing can refer to either an Ethernet device that has more than one network interface, for redundancy purposes, or in firewall technology, dual-homed is one of the firewall architectures, such as an IDS/IPS system, for implementing preventive security.
Which of the following is an application that requires a host application for replication?, Micro, Worm, Trojan,
Virus, D ,Computer viruses infect a variety of different subsystems on their hosts. A computer virus is a malware that, when executed, replicates by reproducing itself or infecting other programs by modifying them. Infecting computer programs can include as well, data files, or the boot sector of the hard drive. When this replication succeeds, the affected areas are then said to be "infected".
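Two things in this output may cause trouble in a real CSV file: the "D." option of question 2 is still split across two lines, and the question and explanation texts contain commas of their own. A minimal sketch of an alternative, assuming every question follows the exact layout of the test data, is to capture the fields with named groups and let Python's csv module do the quoting. The names `QUESTION`, `parse_questions`, and `to_csv` and the shortened `demo` text are hypothetical.

```python
import csv
import io
import re

# Assumed layout: question text, options A-D, then "Answer: X Explanation:".
QUESTION = re.compile(
    r'(?P<question>.*?)\s*\nA\.(?P<a>.*?)\nB\.(?P<b>.*?)'
    r'\nC\.(?P<c>.*?)\nD\.(?P<d>.*?)'
    r'\nAnswer:\s*(?P<answer>[A-D])\s*Explanation:\s*(?P<explanation>.*)',
    re.DOTALL,
)

def parse_questions(raw_text):
    """Return one list of seven fields per recognized question."""
    rows = []
    for block in re.split(r'QUESTION NO:\s*\d+', raw_text):
        match = QUESTION.search(block.strip())
        if match is None:
            continue  # skip empty or unrecognized blocks
        # Collapse internal whitespace, so "D.\nVirus" yields "Virus".
        rows.append([' '.join(field.split()) for field in match.groups()])
    return rows

def to_csv(rows):
    """Serialize rows as CSV text; fields containing commas get quoted."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator='\n').writerows(rows)
    return buffer.getvalue()

# Shortened, made-up data in the same layout as question 2.
demo = """QUESTION NO: 2
Which one needs a host, if any?
A. Micro
B. Worm
C. Trojan
D.
Virus
Answer: D Explanation:
A virus, when executed, replicates."""

rows = parse_questions(demo)
print(to_csv(rows), end='')
```

With this approach a spreadsheet or `csv.reader` should see exactly seven columns per question, because any field that contains a comma or a double quote is quoted by the writer instead of being split.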