To remove every character except underscores and hyphens from a text file in Python, you can use a regular expression with the re module. Here's an example:

import re

# Specify the input and output file paths
input_file = "input.txt"
output_file = "output.txt"

# Open the input file for reading and the output file for writing
with open(input_file, "r") as infile, open(output_file, "w") as outfile:
    # Read the input file contents and remove all characters
    # except underscores and hyphens
    contents = re.sub(r"[^_-]", "", infile.read())

    # Write the cleaned contents to the output file
    outfile.write(contents)

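In this code, the regular expression pattern [^_-] matches any single character that is not an underscore or a hyphen. Because the hyphen is the last character inside the brackets, it is treated as a literal hyphen rather than as a range operator. The re.sub() function replaces every matching character in the input file contents with an empty string. Newlines and other whitespace also match the pattern, so the output is one continuous run of underscores and hyphens.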
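To see the pattern's effect in isolation, here is a minimal sketch applied to a short string instead of a file. The sample text is hypothetical and chosen only for illustration.

```python
import re

# Hypothetical sample text, used only to illustrate the pattern
sample = "snake_case and kebab-case, v1.0\n"

# Delete every character that is not an underscore or a hyphen
cleaned = re.sub(r"[^_-]", "", sample)

print(cleaned)  # prints "_-"
```

Only the single underscore and the single hyphen survive. Letters, digits, punctuation, spaces, and the trailing newline are all removed.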

Note that this code reads the entire input file into memory at once, which can be a problem for very large files. To process large files, you may want an approach that reads and writes the file in chunks, or one that uses a memory-mapped file.
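One possible way to process the file in chunks is sketched below. The function name strip_file_in_chunks and the default chunk size of 64 * 1024 characters are assumptions made for this example, not part of the original code. Because the pattern matches one character at a time, splitting the input at arbitrary chunk boundaries should not change the result.

```python
import re

# Precompile the pattern once, since it is applied to every chunk
PATTERN = re.compile(r"[^_-]")


def strip_file_in_chunks(input_path, output_path, chunk_size=64 * 1024):
    """Copy only underscores and hyphens from input_path to output_path.

    The function name and the chunk_size default are illustrative choices.
    """
    with open(input_path, "r") as infile, open(output_path, "w") as outfile:
        while True:
            # Read at most chunk_size characters; an empty string means EOF
            chunk = infile.read(chunk_size)
            if not chunk:
                break
            outfile.write(PATTERN.sub("", chunk))
```

It could be called as strip_file_in_chunks("input.txt", "output.txt"). Memory use is then bounded by the chunk size rather than by the size of the file.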
