Building Mht Files from URLs revisited

Is it possible to compile this into a dll, for use in vb6, or just convert the code?

Thanks

Any way to apply gzip compression to MHTs?
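
One way this could be approached is to gzip the finished .mht file as a separate step. The following is only a sketch, assuming .NET 2.0 or later (for `GZipStream`); `CompressFile` and the file name `page.mht` are made up for illustration and are not part of MhtBuilder.

```vb
Imports System.IO
Imports System.IO.Compression

Module GzipMhtSketch
    ' Hypothetical helper: compresses an existing file to <sourcePath>.gz
    Sub CompressFile(ByVal sourcePath As String)
        Using source As FileStream = File.OpenRead(sourcePath)
            Using target As FileStream = File.Create(sourcePath & ".gz")
                Using gzip As New GZipStream(target, CompressionMode.Compress)
                    ' Copy the file through the gzip stream in 8 KB chunks
                    Dim chunk(8191) As Byte
                    Dim read As Integer = source.Read(chunk, 0, chunk.Length)
                    While read > 0
                        gzip.Write(chunk, 0, read)
                        read = source.Read(chunk, 0, chunk.Length)
                    End While
                End Using
            End Using
        End Using
    End Sub

    Sub Main()
        ' Hypothetical file name; writes page.mht.gz next to it
        CompressFile("page.mht")
    End Sub
End Module
```

The output is an ordinary .gz archive, so it would have to be decompressed again before a browser can read it.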

Great idea, thanks.

Have you considered extending this to save in other widely used formats - like perhaps .doc?

Because .MHT doesn’t seem to be that widely/consistently supported yet, I’m a little hesitant about adopting it as a format for collecting/archiving saved content.

So a version that supported the .doc format would be a welcome alternative (IMHO). I know, it’s not a truly open standard… but the OpenOffice folks seem to have managed, so perhaps it’s possible?

This is a great, great tool and will work perfectly for my intranet reporting project. Thanks Jeff!

The program is very cool, but it seems I must use the commercial versions.

What I need is to convert a local file to mht and send it with VB.NET’s built-in mail.

Andre

I use Visual Studio 2005.
How do I create just a simple .mht file, with the headers, etc.?
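
For what it’s worth, a bare-bones .mht is just a MIME `multipart/related` message saved to disk. Here is a rough sketch of building one from an HTML string, assuming a single HTML part with no images or stylesheets. `BuildSimpleMht`, the boundary string, `hello.mht` and the `http://localhost/hello.htm` location are all invented for the example; this is not how MhtBuilder itself does it.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module SimpleMhtSketch
    ' Hypothetical helper (not MhtBuilder's API): wraps one HTML string in a
    ' single-part multipart/related envelope and returns the .mht text.
    Function BuildSimpleMht(ByVal html As String, ByVal title As String, ByVal location As String) As String
        ' Arbitrary boundary, assumed not to occur anywhere in the body
        Const Boundary As String = "----=_NextPart_000_0000"
        Dim sb As New StringBuilder()

        ' Top-level MIME headers (title is assumed to be plain ASCII)
        sb.Append("From: <Saved by SimpleMhtSketch>" & vbCrLf)
        sb.Append("Subject: " & title & vbCrLf)
        sb.Append("MIME-Version: 1.0" & vbCrLf)
        sb.Append("Content-Type: multipart/related; type=""text/html""; boundary=""" & Boundary & """" & vbCrLf)
        sb.Append(vbCrLf)

        ' The HTML part, base64-encoded so the file itself stays pure ASCII
        sb.Append("--" & Boundary & vbCrLf)
        sb.Append("Content-Type: text/html; charset=""utf-8""" & vbCrLf)
        sb.Append("Content-Transfer-Encoding: base64" & vbCrLf)
        sb.Append("Content-Location: " & location & vbCrLf)
        sb.Append(vbCrLf)
        sb.Append(Convert.ToBase64String(Encoding.UTF8.GetBytes(html), Base64FormattingOptions.InsertLineBreaks))
        sb.Append(vbCrLf & vbCrLf)

        ' Closing boundary ends the message
        sb.Append("--" & Boundary & "--" & vbCrLf)
        Return sb.ToString()
    End Function

    Sub Main()
        Dim mht As String = BuildSimpleMht("<html><body><h1>Hello</h1></body></html>", "Hello", "http://localhost/hello.htm")
        File.WriteAllText("hello.mht", mht, Encoding.ASCII)
    End Sub
End Module
```

Images and stylesheets would each become a further part between the HTML part and the closing boundary, with a `Content-Location` matching the URL the HTML refers to.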

Hi Jeff,

I have posted a question on The Code Project in regards to my question, but I thought it might be easier to reach you here.

Is this app supposed to loop through the entire site folder and create an mht file for each html file? 1 for 1. It seems to only get the first file it finds, creates an mht (perfectly I might add) but stops there.

I have an intranet that has 12000 files, and the reason I am interested in your app is because I need to convert them into mht, and then upload them into Sharepoint. This then allows users to view all pages in the IE browser, but edit the pages easily in Word.

So as you can imagine, a batch process is paramount.

Oh, and I was a little confused about why it doesn’t keep the original file name, but takes on the page title instead. Why is that?

Thanks for any help you might have, and this page has some good reading.

Hi Jeff,

Any chance you can just let me know whether your script can convert each file on a website (batch style) and I am just doing it wrong, or whether it only does one file per run? I have a rather fast approaching deadline to get these files converted.

Everyone seems to think that only IE can open an mht file, which is incorrect. Opera supports it natively. I believe Firefox has an extension for it somewhere if you’re willing to look for it, download it and keep updating it.

The problem that really annoys me is with IE7 (final version), where saving files as MHTML does NOT guarantee that IE7 itself will actually be able to read them from disk!
If anyone has a work-around (apart from using Firefox, or saving as Complete HTML from IE7), let me know! ildotthomasatiinetdotnetdotau

Hi,
Thank you very much for providing this well crafted code. I need to save html as “Web page complete”, “Web page archive”, “Web page as PDF” for an application to back up blogs. Currently I am doing it using your code. What I am interested in is showing the download progress as the web page is being saved. So I was thinking of combining the functions provided in MHT builder into the extended web browser control found at http://www.codeproject.com/csharp/ExtendedWebBrowser.asp.

I would request your help/guidance/indication/criticism on it.

I noticed in your list of functionality that it now supports iframes. Do iframes cause issues when trying to save in MHT format?
Thanks!

I’ve used your class to make a little application for myself to download and update episode guides, and it works very nicely except for a couple of URLs that only display as text in IE7 once saved as mht files:

http://epguides.com/smallville/

Any ideas?

This works great for most sites, but I have found a few where the mht file displays as text only in IE7.

Here is one example: http://epguides.com/smallville/

Any Ideas?

I have been trying to get your cool mht tool to work with file based urls without much luck (such as file:///C:/App-Dev-2.0/test.htm). I have looked through the code and have added some additional code in the GetUrlData sub in the WebClientEx.vb to account for file based urls:

    ' Only one of the two request variables is assigned, depending on the URL scheme
    Dim bFile As Boolean = False
    Dim freq As FileWebRequest = Nothing
    Dim wreq As HttpWebRequest = Nothing

    If LCase(Microsoft.VisualBasic.Left(Url, 8)) = "file:///" Then
        ' Local file: WebRequest.Create returns a FileWebRequest
        freq = DirectCast(WebRequest.Create(Url), FileWebRequest)
        bFile = True
    Else
        ' Anything else is treated as an HTTP request
        wreq = DirectCast(WebRequest.Create(Url), HttpWebRequest)
    End If

Everything works fine until it hits the FinializeMht sub in Builder.vb. I get a null exception with Dim sr As New StreamWriter(outputFilePath, False, _HtmlFile.TextEncoding).

Any thoughts?

Thanks so much.
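
A possible alternative to branching on the scheme is to stay on the base `WebRequest`/`WebResponse` types, which cover both `http://` and `file:///` URLs. The sketch below also guards against a missing encoding, on the unverified assumption that `_HtmlFile.TextEncoding` ends up as `Nothing` for local files (`StreamWriter` throws an `ArgumentNullException` when handed a null encoding). `ReadUrlBytes` and `EncodingOrDefault` are hypothetical names, not part of MhtBuilder.

```vb
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module FileUrlSketch
    ' Hypothetical helper: reads the raw bytes behind either an http:// or a
    ' file:/// URL without casting to a specific request type.
    Function ReadUrlBytes(ByVal url As String) As Byte()
        Dim request As WebRequest = WebRequest.Create(url)
        Using response As WebResponse = request.GetResponse()
            Using source As Stream = response.GetResponseStream()
                Using collected As New MemoryStream()
                    ' Copy the response stream into memory in 4 KB chunks
                    Dim chunk(4095) As Byte
                    Dim read As Integer = source.Read(chunk, 0, chunk.Length)
                    While read > 0
                        collected.Write(chunk, 0, read)
                        read = source.Read(chunk, 0, chunk.Length)
                    End While
                    Return collected.ToArray()
                End Using
            End Using
        End Using
    End Function

    ' Hypothetical guard: falls back to UTF-8 when no encoding was detected
    Function EncodingOrDefault(ByVal detected As Encoding) As Encoding
        If detected Is Nothing Then Return Encoding.UTF8
        Return detected
    End Function

    Sub Main()
        ' Path taken from the comment above; adjust to a file that exists
        Dim bytes() As Byte = ReadUrlBytes("file:///C:/App-Dev-2.0/test.htm")
        Dim html As String = EncodingOrDefault(Nothing).GetString(bytes)
        Console.WriteLine(html.Length)
    End Sub
End Module
```

A file response carries no HTTP charset header, so any encoding detection that relies on headers would need a fallback of this kind.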

Jeff, all the methods take a URL and then convert it to mht. A method that takes a string of html and converts it to mht would be an added bonus as well.

Hello Jeff,
I have seen your article on CodeProject and from there I was redirected to this URL on CodingHorror for downloading MhtBuilder 2.0. But I don’t see any download link on this CodingHorror page: http://www.codinghorror.com/blog/archives/000249.html. Can you please provide me the link where I can download the code? Also, I tried your code available on CodeProject to create the MHT pages. It doesn’t throw an exception and saves the *.mht file, but the MHT file will not open in IE7. It shows everything in plain text format. Is there any issue with MSIE7.0, VS2005, or Vista Home Premium? Please help me.

Thanks in advance.
Nitin

Has anyone resolved itsky’s question? I would like to bundle a web page off of the local disk into an mht file. Every solution I have tried (MS Word, IE, MS Office 2000 web archive add-in) changes the relative paths to full paths (file://C:…). It looks like the images get embedded, but the src is still set to the full path, so the links are broken.

How can I get any part (like the body, title, images, etc.) from an MHT file? Do you have an example?
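
Since an .mht file is a MIME multipart message, one rough way to get at the pieces is to split on the boundary and decode each part. The sketch below assumes a flat (non-nested) multipart file with a `boundary=` parameter, and only decodes base64 bodies. `MhtPart`, `SplitMht` and `DecodeBody` are names made up for this example and are not part of MhtBuilder.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module MhtPartsSketch
    ' One MIME part: its headers (lower-cased names) and its still-encoded body
    Class MhtPart
        Public Headers As New Dictionary(Of String, String)
        Public Body As String
    End Class

    ' Splits the text of an .mht file into its parts
    Function SplitMht(ByVal mhtText As String) As List(Of MhtPart)
        Dim parts As New List(Of MhtPart)
        Dim m As Match = Regex.Match(mhtText, "boundary=""?([^"";\r\n]+)""?", RegexOptions.IgnoreCase)
        If Not m.Success Then Return parts

        Dim delimiter As String = "--" & m.Groups(1).Value
        Dim chunks() As String = mhtText.Split(New String() {delimiter}, StringSplitOptions.None)
        ' chunks(0) is the top-level header block, so start at 1
        For i As Integer = 1 To chunks.Length - 1
            Dim chunk As String = chunks(i)
            If chunk.StartsWith("--", StringComparison.Ordinal) Then Exit For ' closing delimiter
            chunk = chunk.Replace(vbCrLf, vbLf).TrimStart(ControlChars.Lf)
            Dim blank As Integer = chunk.IndexOf(vbLf & vbLf, StringComparison.Ordinal)
            If blank < 0 Then Continue For

            Dim part As New MhtPart()
            ' Unfold continuation lines, then read "Name: value" pairs
            Dim headerBlock As String = Regex.Replace(chunk.Substring(0, blank), "\n[ \t]+", " ")
            For Each line As String In headerBlock.Split(ControlChars.Lf)
                Dim colon As Integer = line.IndexOf(":"c)
                If colon > 0 Then part.Headers(line.Substring(0, colon).Trim().ToLowerInvariant()) = line.Substring(colon + 1).Trim()
            Next
            part.Body = chunk.Substring(blank + 2).Trim()
            parts.Add(part)
        Next
        Return parts
    End Function

    ' Decodes base64 bodies; anything else (quoted-printable, 7bit, 8bit)
    ' is returned as ASCII bytes without further decoding.
    Function DecodeBody(ByVal part As MhtPart) As Byte()
        Dim encodingName As String = ""
        part.Headers.TryGetValue("content-transfer-encoding", encodingName)
        If String.Equals(encodingName, "base64", StringComparison.OrdinalIgnoreCase) Then
            Return Convert.FromBase64String(part.Body)
        End If
        Return Encoding.ASCII.GetBytes(part.Body)
    End Function

    Sub Main(ByVal args() As String)
        ' Pass the path of an .mht file as the first argument
        For Each part As MhtPart In SplitMht(File.ReadAllText(args(0)))
            Dim contentType As String = ""
            Dim location As String = ""
            part.Headers.TryGetValue("content-type", contentType)
            part.Headers.TryGetValue("content-location", location)
            Console.WriteLine("{0}  {1}  ({2} bytes)", contentType, location, DecodeBody(part).Length)
        Next
    End Sub
End Module
```

The part whose `content-type` starts with `text/html` holds the page itself (the title and body can be read from it once decoded), and the `image/...` parts hold the pictures. Files saved by IE often use quoted-printable for the HTML part, which this sketch leaves undecoded.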

Hi
Just came across the CodeProject article - precisely what I needed. Compiled fine in VS2008. However, I’m getting a couple of problems which I haven’t been able to debug:

  1. It produces an mht file which looks fine in Notepad, but when I open it in IE8 I get a nondescript error message saying the page could not be loaded. Interestingly, when I open an mht file saved by IE, I see the path in the address bar. When I open the generated mht file, IE8 rewrites the URL as mht: file://path
  2. Less important - if I set it to use a temporary or permanent file, I get an error saying the path is too long - not really an issue, but thought I’d bring it up while I was writing.

Thanks for creating such a useful library - I’d appreciate any thoughts on why it doesn’t open in IE8 (running Server 2008 with enhanced security disabled). Could things have changed with newer versions of Windows or VS?
Saqib