Newsgroups : Borland : borland.public.delphi.internet.winsock : 2006 Apr : Re: Convert HTML to Text

www.cryer.info
Managed Newsgroup Archive

Re: Convert HTML to Text

Subject:Re: Convert HTML to Text
Posted by:"Francois Piette [ICS & Midware]" (francois.piet..@overbyte.be)
Date:Tue, 18 Apr 2006 09:25:41

> Very easy to parse text from HTML using IHTMLDocument2.
>
>
> uses ..., MSHTML, ActiveX, ComObj;
>
> procedure TForm1.Button1Click(Sender: TObject);
> var
>   IDoc:      IHTMLDocument2;
>   sHTMLFile: String;
>   v:         Variant;
> begin
>   // Download the page source (IdHTTP1 is presumably an Indy TIdHTTP component)
>   sHTMLFile := IdHTTP1.Get('http://www.mysite.com');
>   IDoc := CreateComObject(Class_HTMLDocument) as IHTMLDocument2;
>   try
>     IDoc.designMode := 'on';
>     while IDoc.readyState <> 'complete' do
>       Application.ProcessMessages;
>     // IHTMLDocument2.write expects a SAFEARRAY of variants
>     v := VarArrayCreate([0, 0], varVariant);
>     v[0] := sHTMLFile;
>     IDoc.write(PSafeArray(System.TVarData(v).VArray));
>     IDoc.designMode := 'off';
>     while IDoc.readyState <> 'complete' do
>       Application.ProcessMessages;
>     // MSHTML has processed the markup; innerText returns the text without tags
>     Memo1.Lines.Text := IDoc.body.innerText;
>   finally
>     IDoc := nil;
>   end;
> end;

Where is the parsing in your code?
Maybe we are not using the term "parsing" for the same thing?
For me, parsing means analysing the HTML code to extract tags, attribute
values, comments, data and so on.
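
To illustrate parsing in that sense, here is a minimal sketch of a scanner
that walks an HTML string and reports comments, tags, attribute values and
text data. It is only an illustration, not code from ICS or from the quoted
post: the names HtmlTokenSketch, DumpTag and DumpTokens are made up for this
example, and the scanner assumes simple markup (no '>' inside attribute
values, no special handling of script or style content, no entity decoding).

```pascal
program HtmlTokenSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

// Hypothetical helper: splits the inside of a tag, for example
// 'a href="x.html" class=nav', into a tag name and attribute pairs.
procedure DumpTag(const Inner: string);
var
  I, Start, Len: Integer;
  Name, Value: string;
  Quote: Char;
begin
  Len := Length(Inner);
  I := 1;
  // Tag name: up to the first blank; a leading '/' (closing tag) is kept.
  while (I <= Len) and (Inner[I] > ' ') and ((Inner[I] <> '/') or (I = 1)) do
    Inc(I);
  WriteLn('TAG     ', Copy(Inner, 1, I - 1));
  while I <= Len do
  begin
    // Skip blanks and a trailing '/' as in <br />.
    while (I <= Len) and ((Inner[I] <= ' ') or (Inner[I] = '/')) do
      Inc(I);
    if I > Len then
      Break;
    // Attribute name: up to '=' or a blank.
    Start := I;
    while (I <= Len) and (Inner[I] > ' ') and (Inner[I] <> '=') do
      Inc(I);
    Name := Copy(Inner, Start, I - Start);
    Value := '';
    if (I <= Len) and (Inner[I] = '=') then
    begin
      Inc(I);
      if (I <= Len) and ((Inner[I] = '"') or (Inner[I] = '''')) then
      begin
        // Quoted value: runs to the matching quote.
        Quote := Inner[I];
        Inc(I);
        Start := I;
        while (I <= Len) and (Inner[I] <> Quote) do
          Inc(I);
        Value := Copy(Inner, Start, I - Start);
        Inc(I); // step over the closing quote
      end
      else
      begin
        // Unquoted value: runs to the next blank.
        Start := I;
        while (I <= Len) and (Inner[I] > ' ') do
          Inc(I);
        Value := Copy(Inner, Start, I - Start);
      end;
    end;
    WriteLn('  ATTR  ', Name, ' = ', Value);
  end;
end;

// Hypothetical helper: walks the HTML once and reports each token.
procedure DumpTokens(const Html: string);
var
  P, Q: Integer;
  Data: string;
begin
  P := 1;
  while P <= Length(Html) do
  begin
    if Copy(Html, P, 4) = '<!--' then
    begin
      // Comment: runs to the next '-->' (or to the end if unterminated).
      Q := PosEx('-->', Html, P + 4);
      if Q = 0 then
        Q := Length(Html) + 1;
      WriteLn('COMMENT ', Trim(Copy(Html, P + 4, Q - P - 4)));
      P := Q + 3;
    end
    else if Html[P] = '<' then
    begin
      // Tag: runs to the next '>' (or to the end if unterminated).
      Q := PosEx('>', Html, P + 1);
      if Q = 0 then
        Q := Length(Html) + 1;
      DumpTag(Copy(Html, P + 1, Q - P - 1));
      P := Q + 1;
    end
    else
    begin
      // Data: everything up to the next '<'.
      Q := PosEx('<', Html, P);
      if Q = 0 then
        Q := Length(Html) + 1;
      Data := Trim(Copy(Html, P, Q - P));
      if Data <> '' then
        WriteLn('TEXT    ', Data);
      P := Q;
    end;
  end;
end;

begin
  DumpTokens('<!-- demo --><p class=intro>Hello <a href="x.html">world</a></p>');
end.
```

Run on the sample string, the sketch reports the comment, each opening and
closing tag with its attributes, and the text in between as separate items.
The quoted code never sees any of these: it only gets the final text back
from MSHTML. A real parser would also have to cope with malformed markup,
character entities and script blocks.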

Contribute to the SSL Effort. Visit http://www.overbyte.be/eng/ssl.html
