leftultimate.blogg.se

Python remove html tags from string








*f2.write = re.findall("(.*?)", str(f)) # select all text between html tags and save to txt file*

Here’s where your problems are. This isn’t doing what you probably think it is, for multiple reasons:

As implied, str(f) does not give you the file contents as a string; rather, it gives you information about the file object. To read the contents of the file, you need to use f.read() (preferably inside a with block).

As also mentioned, what you’re doing above assigns the result of the regex to the .write attribute (in this case, a method) of the file, which replaces the method instead of calling it. Instead, call the .write() method with the string you want to write, i.e. f2.write(your_output_string_here).

Furthermore, you need to pass a string to the .write() method, and re.findall() returns a list of strings. You need to decide, based on the parameters of your assignment, if and how you want to separate the strings in your output file. The simplest solution is to separate them with line breaks, which is fine if you don’t need to process them further individually (if you do, you’d want to consider a different separator that doesn’t appear in the data). To do that, call the join() method of strings with the separator string (e.g. "\n\n" for two line breaks) as the object you’re calling it on, and the list of strings you want to join into one as the argument (e.g. your_findall_output, assuming you’ve assigned the output of findall to a variable of that name).
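The three fixes described here (read with f.read(), join the findall results into one string, and call write() instead of assigning to it) can be put together in a minimal sketch. The regex pattern in the original snippet did not survive, so TEXT_BETWEEN_TAGS below is a hypothetical stand-in, and the names extract_text and html_file_to_text are made up for illustration.

```python
import re

# Hypothetical stand-in pattern (an assumption, not the original regex):
# capture any run of text between a ">" and the next "<".
TEXT_BETWEEN_TAGS = re.compile(r">([^<>]+)<")


def extract_text(html):
    """Return the non-blank text fragments found between tags, as a list."""
    fragments = TEXT_BETWEEN_TAGS.findall(html)
    return [frag.strip() for frag in fragments if frag.strip()]


def html_file_to_text(src, dst, separator="\n\n"):
    """Read HTML from src, write the joined text fragments to dst."""
    # f.read() returns the file contents; str(f) would only describe the file object.
    with open(src, encoding="utf-8") as f:
        html = f.read()

    # findall gives a list, so join it into a single string before writing.
    output = separator.join(extract_text(html))

    # Call write() with the string; do not assign to f2.write.
    with open(dst, "w", encoding="utf-8") as f2:
        f2.write(output)
    return output
```

A regex like this is only adequate for simple, well-formed markup. For real-world HTML, a proper parser such as the standard library’s html.parser module is usually the safer choice.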








