.net - Library for parsing XHTML files with XLINQ


When I realized that I needed to create an index for about 50 XHTML pages, with more likely to be added in the future, I thought, "No problem, I'll write a quick index generator using LINQ to XML, since XHTML certainly counts as XML".

Of course, as soon as I tried to run it, I found out that XLINQ chokes on XHTML entities such as &amp;nbsp;. I got around it using the following algorithm:

  1. Read the XHTML file into a string.
  2. Use a regex search-and-replace to make that string define all relevant entities in the DOCTYPE. (I only care about the "title" attribute in the files I read, and my output file does not use any entities right now, so I just set them to empty, but I can add real values later.)
  3. Parse the result into an XDocument.
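
The three steps above could be sketched roughly as follows. This is an illustration, not the code from the question: the class name `XhtmlLoader`, the entity list, and the regex are all assumptions, and the regex only handles a DOCTYPE that has no internal subset yet. The entities here get real values rather than empty ones.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;

static class XhtmlLoader
{
    // Hypothetical entity list; extend it with whatever the pages actually use.
    const string EntityDeclarations =
        "<!ENTITY nbsp \"&#160;\"> <!ENTITY copy \"&#169;\">";

    // Matches a DOCTYPE up to (not including) its closing '>'.
    // Assumption: the DOCTYPE has no internal subset ("[...]") already.
    static readonly Regex Doctype = new Regex(@"<!DOCTYPE[^>\[]*(?=>)");

    public static XDocument Load(string path)
    {
        // Step 1: read the XHTML file into a string.
        string text = File.ReadAllText(path);

        // Step 2: append the entity declarations as an internal DTD subset
        // (first DOCTYPE only).
        text = Doctype.Replace(
            text, m => m.Value + " [" + EntityDeclarations + "]", 1);

        // Step 3: parse. DTD processing must be allowed so the declared
        // entities are expanded; a null resolver keeps the reader from
        // fetching the external XHTML DTD.
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            XmlResolver = null
        };
        using (var reader = XmlReader.Create(new StringReader(text), settings))
        {
            return XDocument.Load(reader);
        }
    }

    static void Main(string[] args)
    {
        // args[0] is the path to an XHTML file (supplied by the caller).
        XDocument doc = Load(args[0]);
        Console.WriteLine(doc.Root.Name);
    }
}
```

Any entity that appears in a page but is missing from the declared list will still make the parse fail, so the list has to cover everything the pages use.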

To save a file, I do the opposite:

  1. Save the XDocument into a string.
  2. Strip out the entity definitions.
  3. Save the string into the file.
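
A matching sketch of the save direction, under the same caveats: `XhtmlSaver`, the sample document, and the output file name are hypothetical, and the regex assumes the internal subset contains no `]` character.

```csharp
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Linq;

static class XhtmlSaver
{
    // Matches the DOCTYPE together with its internal subset "[...]".
    // Assumption: the subset itself contains no ']' character.
    static readonly Regex DoctypeWithSubset = new Regex(
        @"(<!DOCTYPE[^>\[]*?)\s*\[.*?\]\s*>", RegexOptions.Singleline);

    public static void Save(XDocument doc, string path)
    {
        // Step 1: save the XDocument into a string.
        // Note: ToString() does not emit an XML declaration.
        string text = doc.ToString();

        // Step 2: strip the entity definitions, keeping the rest of the DOCTYPE.
        text = DoctypeWithSubset.Replace(text, "$1>", 1);

        // Step 3: save the string into the file.
        File.WriteAllText(path, text);
    }

    static void Main()
    {
        XNamespace xhtml = "http://www.w3.org/1999/xhtml";

        // A tiny hand-built document standing in for a parsed page.
        var doc = new XDocument(
            new XDocumentType(
                "html",
                "-//W3C//DTD XHTML 1.0 Strict//EN",
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
                "<!ENTITY nbsp \"&#160;\">"),
            new XElement(xhtml + "html",
                new XElement(xhtml + "head",
                    new XElement(xhtml + "title", "Index")),
                new XElement(xhtml + "body")));

        Save(doc, "index.xhtml"); // hypothetical output path
    }
}
```

If the document was loaded with a DOCTYPE node, setting `doc.DocumentType.InternalSubset = null` before saving may be a simpler way to drop the declarations than a regex. Either way, entities that were expanded during parsing are written back as literal characters, not as entity references.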

My question is: is there any library (preferably built into .NET) that lets me read XHTML files into XDocuments? I have written code that gets the job done (generating the current index and testing the rest of the generator program), but I would really prefer not to spend time testing it if someone else has already written and tested this. Thank you so much for your time.
Riya.

Edit: Thank you all very much! This works! I still have to do a little C# string processing when saving the XHTML (I assume the library was not really made for that :)), and I had to tinker with the Agility Pack source so that it inserts a CDATA section inside each style element (even if one already existed), but that is what open source means, right?

This might help:

