Status
Not open for further replies.
1 - 20 of 26 Posts

·
Registered
Joined
·
13 Posts
Discussion Starter · #1 ·
I have a couple of million .html files to change, so I need a batch way of doing them all at once. The change seems simple, but it is turning out to be very difficult to actually get done right. Here is the situation. The files currently have this content:

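<meta name="description" content="Blah Blah Blah."><title> 123 Product XYZ </title>

What I want is to end up with this content:

<meta name="description" content="Blah Blah Blah. 123 Product XYZ "><title> 123 Product XYZ </title>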

Can't figure out any way to do it. Please note that although I've been around computers since they were hacked from granite, I am not a programmer so I'm stuck. Any assistance would be appreciated! Happy holidays! :)
 

·
Super Moderator
Joined
·
79,313 Posts
let me know if you don't get a response in the next 24 hours or so........if not, I'll flag down one of our resident coders. I'm also going to move this to business apps.

thanks,

v
 

·
Registered
Joined
·
9,026 Posts
In the html files you want to read the product name from the title tags, and add it to the end of the description line?

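Take the product name from the title and add it to the end of the description, as shown:
<meta name="description" content="Blah Blah Blah."><title> 123 Product XYZ </title>

<meta name="description" content="Blah Blah Blah. 123 Product XYZ "><title> 123 Product XYZ </title>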

Are these two lines always one right after the other, or could there be intervening lines?
Is there a following line that will always be present?
Can you zip up and attach one or two of the files, assuming it won't violate any privacy concerns?

To attach a file:
  • If using Quick Reply, click the Go Advanced button under the Reply box.
  • Click the Paperclip at the top of the editor window, or scroll down and click the Manage Attachments button in the Additional Options section (may have to expand it).
  • Click the Browse... button and browse to your file
  • Click the Upload button
  • Repeat for any more files, then close the Manage Attachments window
 

·
Super Moderator
Joined
·
79,313 Posts
Thanks, TheOutcaste. :)
 

·
Registered
Joined
·
13 Posts
Discussion Starter · #6 ·
Thanks for the prompt reply. I'm attaching a simplified version of the html in the attachment. The section is exactly as it is in the actual megafiles. You'll see that the two lines are attached and that is the way they are always present, with no intervening lines or characters. The next characters are also always present as is. I really appreciate your help! Thanks! :)
 

Attachments

·
Registered
Joined
·
9,026 Posts
Couple more questions:
The sample has this:
<title> 123 Product XYZ </title>
There is a leading and trailing space there, do you want them removed when it's added to the description, or left as is?
And this part:
content="Blah Blah Blah."

is that ending period always present, and do you want the Product name before the period, or after the period?
 

·
Registered
Joined
·
13 Posts
Discussion Starter · #8 ·
We can just leave the spaces as they are and we can also leave the period where it is at the end of the Blah, so the product comes after the period, naturally with a space. Thanks again for your continued interest! :)
 

·
Registered
Joined
·
9,026 Posts
That makes it easier.
This will find all html files in the source folder, make the change, and write the new file in the destination folder.
The new file will have the same name and relative path. The original file will be unchanged.
Very easy to set it to overwrite the current file, but this is a bit safer for testing.
It is recursive, so it will find all files in the Source folder and its subfolders.
It will output every 5th file name to the title bar as a progress indicator. This does slow it down a bit, so you may want to increase that to a higher number (edit the 5 in the Set _Count=5 part of the Title line), or it can be removed entirely.
Edit the Set _Source and Set _Dest lines to point to the Source and Destination folders. The Destination folders will be created if they don't exist.
Do not set the source and destination to the same folder; it won't work (it will delete all the files). The program checks for that.

Give this a try with a few test files first of course.

Copy the text in the following code block into Notepad.
Save this someplace with a .cmd extension.
Be sure to change the Save as Type: box to All Files when saving.
Double click it to run the file.

Code:
@Echo Off
SetLocal EnableDelayedExpansion
Rem Edit the next two lines to point to your Source and Destination folders
Set _Source=C:\Test Source
Set _Dest=C:\Test Dest
If /I "%_Source%"=="%_Dest%" Echo.Error. Source and Destination are the Same&Pause&Goto :EOF
Set _VBFile=%temp%\SNR.vbs
Call :_MakeVBS
Set _Count=1
For /F "Tokens=* Delims=" %%I In ('Dir /A-D /B /S "%_Source%\*.html"') Do (
  Set _OutPath=%%~dpI
  Set _OutPath=!_OutPath:%_Source%=%_Dest%!
  Set /A _Count-=1
  Rem Progress interval: raise the 5 below to update the title bar less often
  If !_Count!==0 Title Processing %%I & Set _Count=5
  If Not Exist "!_OutPath!" md "!_OutPath!"
  cscript /nologo "%_VBFile%" "%%I" "!_OutPath!\%%~nxI"
)
Del "%_VBFile%"
Goto :EOF
:::::::::::::::::::::::::::::::::::::::::::::::::::
:: Make VBS File
:::::::::::::::::::::::::::::::::::::::::::::::::::
:_MakeVBS
(Echo.Const ForReading = 1
Echo.Const ForWriting = 2
Echo.
Echo.StrFileName = Wscript.Arguments.Item^(0^)
Echo.StrOutfName = Wscript.Arguments.Item^(1^)
Echo.Set objFSO = CreateObject^("Scripting.FileSystemObject"^)
Echo.' Delete output file if it exists
Echo.If objFSO.FileExists^(StrOutfName^) Then objFSO.DeleteFile^(StrOutfName^)
Echo.
Echo.' Set search pattern
Echo.Set objRegEx = CreateObject^("VBScript.RegExp"^)
Echo.objRegEx.IgnoreCase = True
Echo.objRegEx.Global = True
Echo.objRegEx.Pattern = "<meta name=""description"" content=(.+)"">(.*)<title>(\s*)(.+)</title>"
Echo.strRepPatrn = "<meta name=""description"" content=$1 $4"">$2<title>$3$4</title>"
Echo.' Open a file
Echo.Set objFile = objFSO.OpenTextFile^(StrFileName,ForReading^)
Echo.strContents = objFile.ReadAll
Echo.objFile.Close
Echo.strNewStr = objRegEx.Replace^(strContents, strRepPatrn^)
Echo.Set objOutputFile = objFSO.CreateTextFile^(StrOutfName^)
Echo.objOutputFile.WriteLine strNewStr
Echo.objOutputFile.Close)>"%_VBFile%"
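
For anyone who wants to see what the search pattern does to a single line before running it over real files, here is a rough Python equivalent of just the regex step. It is only a sketch and not part of the batch file: the function name add_title_to_description is made up for illustration, the sample line is assumed from the description in this thread, and Python writes the backreferences as \g<1> where VBScript uses $1.

```python
import re

# Same pattern as the VBScript above, written for Python's re module.
# Groups: 1 = description text (with its opening quote), 2 = anything
# between the two tags, 3 = leading whitespace in the title, 4 = title text.
PATTERN = re.compile(
    r'<meta name="description" content=(.+)">(.*)<title>(\s*)(.+)</title>',
    re.IGNORECASE,
)

# Python spells backreferences \g<n>; the VBScript replacement uses $n.
REPLACEMENT = (
    r'<meta name="description" content=\g<1> \g<4>">'
    r'\g<2><title>\g<3>\g<4></title>'
)


def add_title_to_description(html: str) -> str:
    """Append the <title> text to the end of the meta description."""
    return PATTERN.sub(REPLACEMENT, html)


if __name__ == "__main__":
    # Assumed sample line, based on the simplified file described above.
    sample = (
        '<meta name="description" content="Blah Blah Blah.">'
        '<title> 123 Product XYZ </title>'
    )
    print(add_title_to_description(sample))
```

With that sample the description comes out as content="Blah Blah Blah. 123 Product XYZ ", trailing space included, since the spaces in the title are left as they are. Text that does not match the pattern is returned unchanged.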
 

·
Super Moderator
Joined
·
79,313 Posts
TheOutcaste, mind if I ask how you picked this up? Self-taught, picked up over a period of experimentation, or is there some bible out there that you refer to?

You continue to amaze, and I'll leave it at that.

cheers, and happy new year.
 

·
Super Moderator
Joined
·
79,313 Posts
Guy's something else, ain't he? :)
 

·
Registered
Joined
·
9,026 Posts

·
Registered
Joined
·
13 Posts
Discussion Starter · #15 ·
Awesome! I'm really looking forward to testing the livin' daylights out of it in the morning and I'm pretty sure that there will be no ugly old 1970s AMC compact cars in there! :)
 

·
Retired Trusted Advisor
Joined
·
5,465 Posts
Outcaste, Small mistake, big problem. You have no line break between
If /I "%_Source%"=="%_Dest%" Echo.Error. Source and Destination are the Same&Pause&Goto :EOF
and
Set _VBFile=%temp%\SNR.vbs

The result is that, as is, the script will a) not work and b) delete everything from the folder it's executed in. By default it does ask you for confirmation before the wildcard delete, but with echo off you don't know what it's asking you to confirm (you just get the Y/N line).

demimetacalf, I hope that you read this before you test the livin daylights out of it.
The corrected code you need is the following.
Code:
@Echo Off
SetLocal EnableDelayedExpansion
Set _Source=C:\Test Source
Set _Dest=C:\Test Dest
If /I "%_Source%"=="%_Dest%" Echo.Error. Source and Destination are the Same&Pause&Goto :EOF
Set _VBFile=%temp%\SNR.vbs
Call :_MakeVBS
Set _Count=1
For /F "Tokens=* Delims=" %%I In ('Dir /A-D /B /S "%_Source%\*.html"') Do (
  Set _OutPath=%%~dpI
  Set _OutPath=!_OutPath:%_Source%=%_Dest%!
  Set /A _Count-=1
  If !_Count!==0 Title Processing %%I & Set _Count=5
  If Not Exist "!_OutPath!" md "!_OutPath!"
  cscript /nologo "%_VBFile%" "%%I" "!_OutPath!\%%~nxI"
)
Del "%_VBFile%"
Goto :EOF
:::::::::::::::::::::::::::::::::::::::::::::::::::
:: Make VBS File
:::::::::::::::::::::::::::::::::::::::::::::::::::
:_MakeVBS
(Echo.Const ForReading = 1
Echo.Const ForWriting = 2
Echo.
Echo.StrFileName = Wscript.Arguments.Item^(0^)
Echo.StrOutfName = Wscript.Arguments.Item^(1^)
Echo.Set objFSO = CreateObject^("Scripting.FileSystemObject"^)
Echo.' Delete output file if it exists
Echo.If objFSO.FileExists^(StrOutfName^) Then objFSO.DeleteFile^(StrOutfName^)
Echo.
Echo.' Set search pattern
Echo.Set objRegEx = CreateObject^("VBScript.RegExp"^)
Echo.objRegEx.IgnoreCase = True
Echo.objRegEx.Global = True
Echo.objRegEx.Pattern = "<meta name=""description"" content=(.+)"">(.*)<title>(\s*)(.+)</title>"
Echo.strRepPatrn = "<meta name=""description"" content=$1 $4"">$2<title>$3$4</title>"
Echo.' Open a file
Echo.Set objFile = objFSO.OpenTextFile^(StrFileName,ForReading^)
Echo.strContents = objFile.ReadAll
Echo.objFile.Close
Echo.strNewStr = objRegEx.Replace^(strContents, strRepPatrn^)
Echo.Set objOutputFile = objFSO.CreateTextFile^(StrOutfName^)
Echo.objOutputFile.WriteLine strNewStr
Echo.objOutputFile.Close)>"%_VBFile%"
I've just finished, I think, writing a 250 line VBS script for doing such HTML editing so I'll be watching this thread with eagerness to see what I might have done differently. Regex is the first thing I didn't think of.
 

·
Super Moderator
Joined
·
79,313 Posts
Quote:
Self-taught and picked up over a (long) period of experimentation, and Google.
The VBScript Language Reference is quite useful, as is the Script Center, w3schools, and DevGuru.

thanks, mate.......got some reading in front of me. Not to hijack the thread, but I've recently been put in quasi-charge of a server farm, and I need to learn some scripts to automate clearing of logs.......mine are nice and simple, but require manual activation. Need to work on that, and I reckon that the pointers you've pointed me to will take care of that.

That said, if you see a largish mushroom cloud to the east, you know it didn't work.
 

·
Registered
Joined
·
9,026 Posts
Good catch Ent!:up::up:
My fat fingers must have missed the Enter key when I typed that line in, and I didn't notice.:eek::eek:
I've fixed it in my post, and do appreciate the proof-reading.

I never knew Del "" would be seen as Del *, that's a big bug I think. It should always give a file not found error in my opinion.

Just goes to show you should always test with test data.
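
In that spirit, a dry run that only lists what would be written, and changes nothing, is a cheap first test. Here is a rough sketch in Python (not the batch file; plan_outputs is a made-up name, and refusing empty settings is my assumption about a sensible guard, given what Del "" turned out to do):

```python
from pathlib import Path


def plan_outputs(source: str, dest: str) -> list[tuple[Path, Path]]:
    """List (input file, output file) pairs without writing anything."""
    # Refuse empty settings outright rather than let them reach a delete.
    if not source.strip() or not dest.strip():
        raise ValueError("source and destination must both be set")

    src_root = Path(source).resolve()
    dst_root = Path(dest).resolve()
    if src_root == dst_root:
        raise ValueError("source and destination are the same folder")

    pairs = []
    # rglob only searches under src_root, so nothing outside it is listed.
    for src_file in sorted(src_root.rglob("*.html")):
        if src_file.is_file():
            # Keep the same name and relative path under the destination.
            pairs.append((src_file, dst_root / src_file.relative_to(src_root)))
    return pairs
```

Printing the pairs for a small test folder shows exactly which files would be read and where each copy would land. The sketch does not handle a destination placed inside the source folder.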
 

·
Registered
Joined
·
13 Posts
Discussion Starter · #20 ·
WHOOOOOOOOOOHOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOoooooooooooooooooo :D

We got gremlins, but we also have success!

I'll start at the beginning.

I created two folders and placed them both in C: root.

arc02test1 had 1900 html files as a test, with no subfolders.

testdest1 was an empty folder

I ran the vbs1.cmd file

Within about a minute I had 1900 html files in testdest1 which were absolutely flawless and perfect! Wheeee! :up:

Unfortunately, the VBS kept running. It started going through some folders in my applications such as Photoshop, then it went into the Recycle Bin and showed continually scrolling failures, all with the same SNR.vbs(17,1) error. I'm attaching a screenshot.

I terminated the VBS, created a new vbs1.cmd file, reset everything with arc02test2 and testdest2, but now all it does when I run it is go right to the Recycle Bin and error out.

So it worked once! So all we have to do is run a steamroller over a few AMC gremlins and we'll be THERE!

I have no words to thank all of you. This is aaaaaaaaaaaaaaaaaaaaaaaaaamazing! :)
 

Attachments
