tetore.blogg.se - Manictime import tags xml

MANICTIME IMPORT TAGS XML CODE

I am trying to get the whole content between an opening XML tag and its closing counterpart. Getting the content in simple cases, such as a title element that holds only text, is easy, but how can I get the whole content between the tags if mixed content is used and I want to preserve the inner tags?

What I want is the content between the two <text> tags, including any inner tags: Some text with data in it. The content may span one, two or more lines.

For now I use regular expressions, but it gets messy and I do not like this approach. I lean towards an XML-parser-based solution. I looked over minidom, etree, lxml and BeautifulSoup but could not find a solution for this case (whole content, including inner tags).

That is considerably easy with lxml*, using the parse() and tostring() functions:

    from lxml.etree import parse, tostring

First you parse the doc and get your element (I am using XPath, but you can use whatever you want):

    doc = parse('test.xml')
    element = doc.xpath('//text')[0]

The tostring() function returns a text representation of your element:

    >>> tostring(element)

(On Python 3, tostring() returns bytes unless you pass encoding='unicode', so add that argument before applying the string methods below.)

However, you do not want the outer tags, so we can remove the opening one with a simple str.replace() call:

    >>> tostring(element).replace('<%s>' % element.tag, '', 1)

Note that str.replace() received 1 as the third parameter, so it removes only the first occurrence of the opening tag. For the closing tag, instead of 1, we pass -1 to replace(). A count of -1 replaces every occurrence, so this assumes that no inner element shares the outer tag's name:

    >>> tostring(element).replace('</%s>' % element.tag, '', -1)

The solution, of course, is to do everything at once:

    >>> tostring(element).replace('<%s>' % element.tag, '', 1).replace('</%s>' % element.tag, '', -1)

EDIT: It was pointed out that this code is fragile, since the tag can have attributes. A possible yet still limited solution is to split the string at the first >:

    >>> tostring(element).split('>', 1)

Get the second resulting string:

    >>> tostring(element).split('>', 1)[1]

Then rsplit it at the last </:

    >>> tostring(element).split('>', 1)[1].rsplit('</', 1)

And finally get the first result:

    >>> tostring(element).split('>', 1)[1].rsplit('</', 1)[0]

Nonetheless, this code is still fragile, since > is a perfectly valid character in XML, even inside attributes.

Anyway, I have to acknowledge that MattH's solution is the real, general solution.

* Actually, this solution works with ElementTree too, which is great if you do not want to depend upon lxml.
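For comparison, here is a minimal sketch of both ideas using the standard library's ElementTree, so it runs without lxml. The sample document, the inner element name extradata, and the two helper names are assumptions made for illustration; they do not come from the original post. The second helper is one common parser-based alternative and is not necessarily the solution credited to MattH.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = (
    "<review>"
    "<title>Some testing stuff</title>"
    "<text>Some text with <extradata>data</extradata> in it.</text>"
    "</review>"
)


def inner_xml_by_slicing(element):
    """String-based approach: cut at the first '>' and at the last '</'."""
    serialized = ET.tostring(element, encoding="unicode")
    return serialized.split(">", 1)[1].rsplit("</", 1)[0]


def inner_xml_by_children(element):
    """Parser-based approach: leading text plus each serialized child.

    ET.tostring() on a child also emits the child's tail text, so the
    text that follows each inner tag is preserved.
    """
    parts = [element.text or ""]
    parts.extend(ET.tostring(child, encoding="unicode") for child in element)
    return "".join(parts)


root = ET.fromstring(SAMPLE)
text_element = root.find("text")

sliced = inner_xml_by_slicing(text_element)
joined = inner_xml_by_children(text_element)
print(joined)
```

On this sample both helpers return the same string, with the inner extradata element intact. The second one never inspects the serialized outer tag, so it does not depend on how the opening tag or its attributes are written.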