Newsgroups : Borland : borland.public.delphi.internet.winsock : 2006 Apr : Re: Convert HTML to Text
Subject: Re: Convert HTML to Text
Posted by: "Francois Piette [ICS & Midware]" (francois.piet..@overbyte.be)
Date: Tue, 18 Apr 2006 09:25:41
> Very easy to parse text from HTML using IHTMLDocument2.
>
> uses ..., mshtml, ActiveX, ComObj;
>
> procedure TForm1.Button1Click(Sender: TObject);
> var
>   IDoc: IHTMLDocument2;
>   sHTMLFile: String;
>   v: Variant;
> begin
>   sHTMLFile := idHTTP1.Get('http://www.mysite.com');
>   IDoc := CreateComObject(Class_HTMLDocument) as IHTMLDocument2;
>   try
>     IDoc.designMode := 'on';
>     while IDoc.readyState <> 'complete' do
>       Application.ProcessMessages;
>     // IHTMLDocument2.write takes a SAFEARRAY of variants
>     v := VarArrayCreate([0, 0], varVariant);
>     v[0] := sHTMLFile;
>     IDoc.write(PSafeArray(System.TVarData(v).VArray));
>     IDoc.designMode := 'off';
>     while IDoc.readyState <> 'complete' do
>       Application.ProcessMessages;
>     // innerText is the body text with all markup removed
>     Memo1.Lines.Text := IDoc.body.innerText;
>   finally
>     IDoc := nil;
>   end;
> end;
Where is the parsing in your code?
Maybe we don't use the term "parsing" to mean the same thing?
For me, parsing means analysing the HTML code to extract tags, attribute
values, comments, data and so on.
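
To illustrate parsing in that sense, here is a minimal sketch of a tokenizer that walks the HTML source itself and reports comments, tags and data separately. The names FindFrom and DumpTokens are hypothetical helpers made up for this example, not part of ICS or any library. It is deliberately naive: attributes are left unsplit inside the tag text, and a '>' inside an attribute value or a script body would confuse it.

```pascal
program HtmlTokens;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical helper: position of Sub in S at or after Start, 0 if absent.
function FindFrom(const Sub, S: string; Start: Integer): Integer;
var
  Rel: Integer;
begin
  Rel := Pos(Sub, Copy(S, Start, MaxInt));
  if Rel = 0 then
    Result := 0
  else
    Result := Start + Rel - 1;
end;

// Hypothetical helper: prints one line per comment, tag or data run.
procedure DumpTokens(const Html: string);
var
  P, Q: Integer;
begin
  P := 1;
  while P <= Length(Html) do
  begin
    if Copy(Html, P, 4) = '<!--' then
    begin
      // Comment: everything up to the closing '-->'
      Q := FindFrom('-->', Html, P + 4);
      if Q = 0 then
        Q := Length(Html) + 1;  // unterminated: take the rest
      WriteLn('COMMENT: ', Copy(Html, P + 4, Q - P - 4));
      P := Q + 3;
    end
    else if Html[P] = '<' then
    begin
      // Tag: name and raw attribute text up to the next '>'
      Q := FindFrom('>', Html, P + 1);
      if Q = 0 then
        Q := Length(Html) + 1;
      WriteLn('TAG:     ', Copy(Html, P + 1, Q - P - 1));
      P := Q + 1;
    end
    else
    begin
      // Data: plain text up to the next '<'
      Q := FindFrom('<', Html, P);
      if Q = 0 then
        Q := Length(Html) + 1;
      WriteLn('DATA:    ', Copy(Html, P, Q - P));
      P := Q;
    end;
  end;
end;

begin
  DumpTokens('<p class="x">Hello <!-- note --><b>world</b></p>');
end.
```

Unlike reading innerText, the caller here sees every tag and comment as a separate token and decides what to keep. A real parser would go further and split each tag into its name and attribute name/value pairs.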
Contribute to the SSL Effort. Visit http://www.overbyte.be/eng/ssl.html