PICList Thread
'[EE]: How to find the changes in a new version of '
2001\03\03@143559 by James Michael Newton

If I can figure out how to find the changes in a new version of a file and
present them in a human readable format, I'll be able to start offering
people a more complete version of the change log that is currently being
generated at the site.

Take a look at
www.piclist.com/techref/microchip/new200102.txt
or
http://www.piclist.com/techref/microchip/new200103.txt

The idea is that people could find only the items that have been changed
over the last month. This could be a link from the site or it could be
emailed to people who want it on a monthly basis. A digest of site updates.

The problem is when a page is edited. I have the old file, and the new file,
and I don't want to just paste the entire new file into the change log but I
also don't want to just put "the page was edited" in the change log and
force people to go to the page and try to guess what is different.

I've looked at the diff program
www.openbsd.org/cgi-bin/cvsweb/src/gnu/usr.bin/diff/
and at this algo
http://www.cs.sunysb.edu/~algorith/files/longest-common-substring.shtml

But they both seem to be based on the idea of creating a file for a machine
to use and / or they are very, very complex. I need something simpler. It
doesn't have to be perfect, but it needs to produce an output useful to
human readers.

So far I have something like the following (in VBScript):

Function boolComp(aryA, intA, aryB, intB)
  boolComp = false
  if intA <= ubound(aryA) and intB <= ubound(aryB) then
    if aryA(intA) = aryB(intB) then boolComp = true
  end if
end function

'snip, figure out what the files are
'and read the new one into strFile
'and the old one in to strFileBak

       aryFile = split(replace(strFile, ">", "><"),"<")
       aryFileBak = split(replace(strFileBak, ">", "><"),"<")

'This splits the file into an array of HTML tags and text.
'The array elements with a ">" are the tags
'and the ones without are text. But the point
'was just to provide a consistent way to split
'it up into manageable chunks. If one word in a paragraph
'has changed, it's ok to show the entire paragraph. Also,
'I can choose to say that if the only changes were to HTML
'tags, the log can say "was edited for format" and nothing else.

       intI = 1
       intJ = 1
       'walk both arrays until each has been consumed
       while intI <= ubound(aryFile) or intJ <= ubound(aryFileBak)
         if boolComp(aryFile, intI, aryFileBak, intJ) then
           'same chunk in both files: blank it out so only changes remain
           aryFile(intI) = ""
           aryFileBak(intJ) = ""
           intI = intI + 1
           intJ = intJ + 1
         elseif boolComp(aryFile, intI+1, aryFileBak, intJ+1) then
           'a single chunk was replaced: leave both versions in the arrays
           intI = intI + 1
           intJ = intJ + 1
         else
           'look ahead in each file for the nearest chunk that matches
           'note: Max is not built into VBScript and must be defined elsewhere
           for intDist = 1 to Max(ubound(aryFile),ubound(aryFileBak))
             if boolComp(aryFile, intI+intDist, aryFileBak, intJ) then
               intI = intI+intDist
               aryFile(intI) = ""
               aryFileBak(intJ) = ""
               exit for
             end if
             if boolComp(aryFile, intI, aryFileBak ,intJ + intDist) then
               intJ = intJ+intDist
               aryFile(intI) = ""
               aryFileBak(intJ) = ""
               exit for
             end if
           next
           intI = intI + 1
           intJ = intJ + 1
         end if
       wend

But this doesn't seem to work well. It tends to resynchronize incorrectly
and leave unchanged parts of the file in the arrays.

Does anyone have a simple algorithm for doing this? ...a pointer to one?
...an idea where I might find a pointer to one? ...an idea why this isn't
totally clear and solvable by me on the first glance?

James Newton, PICList Admin #3
spam_OUTjamesnewtonTakeThisOuTspampiclist.com
1-619-652-0593 phone
http://www.piclist.com
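
The greedy look-ahead above locks onto the first nearby match, which is one likely reason it resynchronizes in the wrong place. A longest-matching-block comparison avoids that. What follows is only a minimal sketch of that idea, written in Python rather than the VBScript used in the post, and it leans on the standard difflib module instead of a hand-written matcher. The names chunk_html and summarize_changes and the sample pages are made up for illustration.

```python
import difflib
import re


def chunk_html(html):
    """Split HTML into tag chunks ('<p>') and text chunks, dropping blanks."""
    parts = re.split(r"(<[^>]*>)", html)
    return [p for p in parts if p.strip()]


def summarize_changes(old_html, new_html):
    """Return (removed, added) lists of text chunks; tag-only changes are ignored."""
    old, new = chunk_html(old_html), chunk_html(new_html)
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    removed, added = [], []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue
        # Keep only text chunks; chunks that start with "<" are tags.
        removed += [c for c in old[i1:i2] if not c.startswith("<")]
        added += [c for c in new[j1:j2] if not c.startswith("<")]
    return removed, added


# Hypothetical page contents, for illustration only.
old_page = "<p>PIC16F84 notes</p><p>Uses a 4 MHz crystal.</p>"
new_page = "<p>PIC16F84 notes</p><p>Uses a 20 MHz crystal.</p><p>New section.</p>"
removed, added = summarize_changes(old_page, new_page)
print("Removed:", removed)
print("Added:", added)
```

If both lists come back empty even though the files differ, only tags changed, which corresponds to the "was edited for format" case described in the post.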



--
http://www.piclist.com hint: PICList Posts must start with ONE topic:
[PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads


2001\03\05@052232 by Simon Nield

fc oldfile.txt newfile.txt > changes.txt
or
diff oldfile newfile > changes

Then parse the changes file if that's not enough... which is so simple I have probably misunderstood
the question.

Regards,
Simon
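
Where neither fc nor diff is available to the script that builds the change log, the same kind of output can be produced in-process. This is a rough sketch in Python (not the VBScript from the original post) using the standard difflib module. The function name change_report, the file name page.htm and the sample text are assumptions made for the example.

```python
import difflib


def change_report(old_text, new_text, name="page.htm"):
    """Build a unified diff of two versions of a file as one string."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="old/" + name,
        tofile="new/" + name,
        lineterm="",  # the lines were split without their newlines
        n=0,          # no context lines: show only what changed
    )
    return "\n".join(diff)


# Hypothetical file contents, for illustration only.
old_text = "line one\nline two\nline three\n"
new_text = "line one\nline 2\nline three\nline four\n"
report = change_report(old_text, new_text)
print(report)
```

Lines that begin with "-" exist only in the old version and lines that begin with "+" exist only in the new one, so the report can be pasted into a change log more or less as it stands.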

--
http://www.piclist.com hint: The PICList is archived three different
ways.  See http://www.piclist.com/#archives for details.


2001\03\05@052856 by D Lloyd

Or "Beyond Compare" it (Scooter Software). It's also a very good tool
for verifying CD-R backups.

Dan






2001\03\05@094048 by severson

> Subject: [EE]: How to find the changes in a new version of a file

James:

My advice? Don't reinvent the "diff" tools, use them.

I'd take this route:

1) Strip the HTML tags out of the two files.
2) Diff 'em (ignoring white-space differences).
3) Parse the result of the diff, possibly detecting "false hits" that
may not matter.

This probably isn't too much initial work. You can refine the process if
this shows good results.

-Rob
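
The three steps above can be sketched roughly as follows. This is an illustration in Python (not the VBScript from the original post) that uses the standard html.parser and difflib modules in place of an external diff program. The names TextOnly, visible_lines and text_changes and the sample pages are made up for the example.

```python
import difflib
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text between tags (step 1)."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_data(self, data):
        self.pieces.append(data)


def visible_lines(html):
    """Strip tags, collapse runs of white space, and drop empty lines."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    text = "".join(parser.pieces)
    lines = (" ".join(line.split()) for line in text.splitlines())
    return [line for line in lines if line]


def text_changes(old_html, new_html):
    """Diff the visible text (step 2) and keep only changed lines (step 3)."""
    delta = difflib.ndiff(visible_lines(old_html), visible_lines(new_html))
    return [d for d in delta if d.startswith(("- ", "+ "))]


# Hypothetical page contents, for illustration only.
old_html = "<h1>Timers</h1>\n<p>TMR0 is  8 bits wide.</p>\n"
new_html = ("<h1>Timers</h1>\n<p><b>TMR0</b> is 8 bits wide.</p>\n"
            "<p>TMR1 is 16 bits wide.</p>\n")
for change in text_changes(old_html, new_html):
    print(change)
```

In this sample the added bold tag and the doubled space do not show up as changes, which is the sort of "false hit" the third step is meant to filter out.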



2001\03\05@134909 by James Newton

No. You understood the question. <SIGH> I'm like that... I can solve
massively complex problems, but will often miss a perfectly simple solution
that is staring me in the face.

Thanks to you and Andy Warren for pointing this out.

James Newton, PICList Admin #3
jamesnewtonspamspam_OUTpiclist.com
1-619-652-0593 phone
http://www.piclist.com

