#147041 - 05/03/2003 15:07
UNIX directory list parsing help, please
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Hello all,
As I have mentioned before, I have my entire (rather large) CD collection as mp3 on a linux server. I am trying to create a list of the contents in the order in which they were uploaded to the server from my ripping machine. As an aside, I'm doing this because I know that files in a certain date range have only version 1 id3 tags, while others have the date screwed up in the tag, etc.
ls -lag * gives me the correct output, which I redirect into a text file. Here's the problem:
How to actually get that list into some kind of delimited file with real date fields? Here's what the data looks like:
Allman Brothers Band:
total 240
drwxrwxr-x 25 501 501 4096 Feb 18 18:45 .
drwxrwxr-x 1010 501 501 24576 Feb 18 19:03 ..
drwxrwxr-x 2 501 501 4096 Jul 3 2001 A Decade of Hits 1969-1979
.
. (etc) more entries...
.
drwxrwxr-x 2 502 501 4096 Jan 20 2002 Eat A Peach
drwxrwxr-x 2 502 501 4096 Feb 18 18:44 Enlightened Rogues
drwxrwxr-x 2 501 501 4096 Aug 15 2001 Fillmore Concerts Disc 1
drwxrwxr-x 2 501 501 4096 Aug 15 2001 Fillmore Concerts Disc 2
drwxrwxr-x 2 502 501 4096 Feb 18 18:44 Hell & High Water
drwxrwxr-x 2 501 501 4096 Aug 15 2001 Live At The Ludlow Garage 1970 Disc 2
.
.
.
drwxrwxr-x 2 502 501 4096 Feb 18 18:45 Wipe the Windows, Check the Oil
Altan:
total 96
drwxrwxr-x 7 501 501 4096 Jan 17 2002 .
drwxrwxr-x 1010 501 501 24576 Feb 18 19:03 ..
You see, the date is actually 3 fixed-width fields, but if the date is recent (ls uses a cutoff of about six months), it puts the time in place of the year. The artist (parent directory) appears above with a ":" character at the end. There is a blank line before each new artist. The output I want is:
Artist, Album, Date
Where the Artist comes from the line above the contents, the Album is the subdirectory name, and the Date is a usable reformatting of the 3 fixed-width date fields. All of the other stuff (the blank lines and the . and .. lines) needs to be stripped.
By changing the system date on the machine, to 2004 for instance, I can eliminate the time vs. year problem, but I'm still having problems parsing this file.
Times like these I wish I knew perl. Can anyone rip this out off the top of their head?
Oh yeah, one last thing that helps is that the number right after the permissions in the "." line contains the number of entries in that Artist subdirectory + 2 (+2 because . and .. both count as entries).
Thanks in advance,
Jim
#147042 - 05/03/2003 15:16
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Well one thing you can do is pass the --full-time switch to your ls command, that will make the dates easier to deal with.
#147043 - 05/03/2003 15:23
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
The better way would be to write a perl script that would deal with the files' metadata directly instead of trying to parse ls's output. If you're stuck with that, though, you can use ls's --full-time option, if you're using GNU ls.
Then you can use cut and/or awk and date -d to get it rectified, if needed.
_________________________
Bitt Faulk
#147044 - 05/03/2003 15:31
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
Assuming your directory structure is consistently Artist/Album, then this ought to work:
#!/usr/bin/ksh
for artist in *; do
    cd $artist
    for album in *; do
        echo -e "$artist $album\c"
        ls --full-time -d $album | cut -c 44-67
    done
    cd ..
done
_________________________
Bitt Faulk
#147045 - 05/03/2003 15:35
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Well, the -T option in linux gives the full time, so that's a help. What I need now is to get the parent directory name to appear for each entry, or figure out some way to do that. This is closer, though. I didn't even think that there would be a "full time" option on ls...
Jim
#147046 - 05/03/2003 15:37
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Whoa. Thanks, Bitt! I'll give that a shot. I appreciate the help! I'll let you know how it works...
Thanks again,
Jim
#147047 - 05/03/2003 15:40
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
I'm not sure if it's what you're looking for, but ls -d --full-time */* | sed 's#/# #' will also get you pretty close.
_________________________
Bitt Faulk
#147048 - 05/03/2003 15:51
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
OK, no ksh on the system. Grrrr. I understand what you're doing, though, and I'll rewrite it in bash or csh. Or, I'll give your alternate version a try.
Thanks again,
Jim
#147049 - 05/03/2003 15:54
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Awright, the problem with the script is that the directory names have spaces in them so it tries to list each word separately...
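To illustrate what is going wrong: an unquoted $album is split on whitespace before ls ever sees it. The sketch below uses Python's `shlex` module as a stand-in for the shell's word splitting (an approximation, not the shell itself), with one album name from the listing as sample input.

```python
import shlex

album = "Eat A Peach"

# Unquoted, the way `ls -d $album` expands in the script:
# the name becomes three separate arguments.
unquoted = shlex.split("ls -d " + album)

# Quoted, the way `ls -d "$album"` would expand:
# the name stays a single argument.
quoted = shlex.split("ls -d " + shlex.quote(album))

print(unquoted)
print(quoted)
```

In the script itself, the equivalent change is to write "$artist" and "$album" in double quotes wherever they are used.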
#147050 - 05/03/2003 16:29
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 18/01/2000
Posts: 5683
Loc: London, UK
|
$ find . -type f -print0 | xargs -0 ls --full-time
_________________________
-- roger
#147051 - 05/03/2003 16:31
Re: UNIX directory list parsing help, please
[Re: Roger]
|
member
Registered: 16/12/1999
Posts: 188
Loc: Melbourne, Australia
|
And to do the whole thing...
$ find . -type f -print0 | xargs -0 ls --full-time | cut -c44- | sort
#147052 - 05/03/2003 17:24
Re: UNIX directory list parsing help, please
[Re: rjlov]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Won't that sort command sort them by day of the week instead of the date?
#147053 - 06/03/2003 03:13
Re: UNIX directory list parsing help, please
[Re: Roger]
|
carpal tunnel
Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
|
$ find . -type f -print0 | xargs -0 ls --full-time
[puts on stripy shirt, blows whistle, gesticulates]
Unix foul. Using two commands where one would do. Five-yard penalty. First down.
$ find . -type f -a -printf "%T@ %Td-%Tb-%TY %p\n" | sort -g | cut -f2- -d' '
Peter
Edit: should be -printf, not printf
#147054 - 06/03/2003 08:10
Re: UNIX directory list parsing help, please
[Re: peter]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
OK, I am willing to admit it. You guys have a bigger UNIX than me. :-)
Seriously, thanks for the help. Only problem is, none of these things work.
First, -type f makes find list the files, not the "Album" subdirectories. Changing it to -type d finds all the directories, not just the Album ones (subdirectories), and the nice single command option doesn't work at all.
I'd really like to get this working. I'm not good at the regular expression stuff and I appreciate all of your help.
I have two systems that I can access the volume from. One is a RedHat linux machine (older, 6.2, I think, with a new kernel) and the other is an OpenBSD 3.0 machine. So, those are the commands I can use, though I can (and would) add whatever command(s) I need to get this to work.
ls on these machines has a -T option that gives full time.
The nice printf solution doesn't work, as far as I can tell, for two reasons: 1. I can't get find to output to printf without using a pipe (or at least, I don't know how on my system -- there is no -a option), and 2. The formatting doesn't work:
find . -type f | printf "%T@ %Td-%Tb-%TY %p\n" | more
-csh: illegal format character
The original:
find . -type f -a printf "%T@ %Td-%Tb-%Ty &p\n" | more
find: printf: unknown option
Comments?
Thanks again,
Jim
#147055 - 06/03/2003 08:15
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
|
find . -type f -a printf "%T@ %Td-%Tb-%Ty &p\n" | more
find: printf: unknown option
Oops. That should be -printf, with a dash. I'll edit the post. And surely you're interested in fixing the files that were created at a certain time, not the directories?
And in fact you probably should change the %T's to %C's in order to get ctime instead of mtime.
Peter
#147056 - 06/03/2003 08:21
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
He meant for there to be a `-' before the printf:
find . -type f -a -printf "%T@ %Td-%Tb-%Ty %p\n"
_________________________
Bitt Faulk
#147057 - 06/03/2003 08:22
Re: UNIX directory list parsing help, please
[Re: peter]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Well, yes, but I ripped them as entire CDs, so I want a list of CDs that I need to take another look at. Then I'll fix up all the files, or potentially re-rip them, depending upon which category they are in.
JC
#147058 - 06/03/2003 08:34
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
You know, this many flavors of unix thing sucks. -printf is not an option on my find. There is -print and -print0 but no -printf
"No problem", says me, "that's what pipes are for":
find . -type f -print | printf "%T@ %Td-%Tb-%Ty &p\n"
-csh: illegal format character
Standards are good things. Does this suck horribly, or is it just me? Am I back to redoing Bitt's script in csh? Am I a stupid ass for not being able to figure this out? (sure feels like it)...
Damnation. I think my printf sucks, too. man printf gives these arguments:
Format:
     A character which indicates the type of format to use (one of
     diouxXfEgGbcs).
     A field width or precision may be `*' instead of a digit string. In this
     case an argument supplies the field width or precision.
     The format characters and their meanings are:
     diouXx  The argument is printed as a signed decimal (d or i), unsigned
             octal, unsigned decimal, or unsigned hexadecimal (x or X),
             respectively.
     f       The argument is printed in the style [-]ddd.ddd where the number
             of d's after the decimal point is equal to the precision
             specification for the argument. If the precision is missing, 6
             digits are given; if the precision is explicitly 0, no digits
             and no decimal point are printed.
     eE      The argument is printed in the style [-]d.ddde+-dd where there is
             one digit before the decimal point and the number after is equal
             to the precision specification for the argument; when the
             precision is missing, 6 digits are produced. An upper-case `E'
             is used for an E format.
     gG      The argument is printed in style f or in style e (E) whichever
             gives full precision in minimum space.
     b       Characters from the string argument are printed with
             backslash-escape sequences expanded.
     c       The first character of argument is printed.
     s       Characters from the string argument are printed until the end is
             reached or until the number of characters indicated by the
             precision specification is reached; however if the precision is
             0 or missing, all characters in the string are printed.
     %       Print a `%'; no argument is used.
#147059 - 06/03/2003 08:36
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Standards are good things.
And the best thing about standards is that there are so many of them to choose from.
#147060 - 06/03/2003 08:44
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 13/07/2000
Posts: 4180
Loc: Cambridge, England
|
You know, this many flavors of unix thing sucks. -printf is not an option on my find. There is -print and -print0 but no -printf
Your find sucks. That option is in GNU find 4.1 of November 1994. Try it on the Redhat box? What does "find --version" say?
The standalone printf command doesn't stand a chance of doing the right thing as it doesn't have all the information that find does.
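To make that concrete: whatever sits on the receiving end of the pipe has to look up each file's metadata itself, which the printf command never does. A rough sketch of such a filter in Python follows, assuming one path per line on input (so names containing newlines would break it). The name `stamp_paths` is invented for this example.

```python
import os
import time


def stamp_paths(lines):
    """Yield 'DD-Mon-YYYY path' strings, oldest modification time first.

    lines is an iterable of path names, one per line, such as the
    output of `find . -type f -print`.
    """
    stamped = []
    for line in lines:
        path = line.rstrip("\n")
        if not path:
            continue
        # lstat reads the metadata that a plain printf has no access to.
        stamped.append((os.lstat(path).st_mtime, path))
    # Sort on the raw timestamp, then format only for display.
    for mtime, path in sorted(stamped):
        day = time.strftime("%d-%b-%Y", time.localtime(mtime))
        yield "%s %s" % (day, path)
```

Saved in a script that loops over `stamp_paths(sys.stdin)` and prints each row, this could sit after `find . -type f -print |` on a system whose find has no -printf.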
Peter
#147061 - 06/03/2003 08:49
Re: UNIX directory list parsing help, please
[Re: peter]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
Screw it.
#!/usr/bin/perl
opendir(TOP, ".");
foreach $dir (readdir(TOP)) {
    next if $dir eq ".";
    next if $dir eq "..";
    if ( -d $dir ) {
        opendir(SUB, $dir);
        foreach $subdir (readdir(SUB)) {
            next if $subdir eq ".";
            next if $subdir eq "..";
            if ( -d "$dir/$subdir" ) {
                $mtime = (stat("$dir/$subdir"))[9];
                print "$dir $subdir " . localtime($mtime) . "\n";
            }
        }
        closedir(SUB);
    }
}
closedir(TOP);
Edit: cleaned up getting the mtime
Edited by wfaulk (06/03/2003 08:53)
_________________________
Bitt Faulk
#147062 - 06/03/2003 09:16
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Thank you again! I really appreciate the help. I do realize that you all have other stuff going on and I really appreciate the time you all have spent. Thank you.
The perl script just about does it. For some reason, the date is goofed up. All the directories show between Mar 2 and today and some time that I can't figure out. I don't really know what date it could be, actually, not the last accessed date... Not the creation time of the file, at any rate.
That bit should be possible for me to figure out, however. I'm also going to change it to tab-delimited or fixed-width so I can use it more easily.
This is a huge help, Bitt.
Thanks again!
Jim
#147063 - 06/03/2003 09:32
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
As to formatting, read the perlform man page.
As to times on directories, this is one of the minor pieces of Unix arcana.
There are three times associated with every Unix file (or directory, since directories are actually files, too). There's the ctime, mtime, and atime. atime is access time. It's the last time that someone looked at the data in the file. mtime is the modification time. It's the last time someone changed the data in the file. ctime is change time, not creation time. It's the last time that the file's metadata was changed. This is stuff like file creation, obviously, chown, chmod, etc. That last one trips people up. There's no dedicated creation timestamp in Unix.
But then you have to understand how directories are modified. When a file is created in a directory or removed from it, the directory's mtime and ctime change. Reading the directory, as ls does, updates its atime. I think that's it.
I don't know exactly which time you're looking for.
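For anyone who wants to look at all three on a given directory, here is a small sketch in Python (a stand-in for Perl's stat, where atime, mtime and ctime are elements 8, 9 and 10 of the returned list). The helper name `show_times` is made up. On a typical Unix filesystem, creating a file inside a directory moves that directory's mtime forward.

```python
import os
import time


def show_times(path):
    """Return the three Unix timestamps of path as readable strings."""
    st = os.stat(path)
    return {
        "atime": time.ctime(st.st_atime),  # last read of the contents
        "mtime": time.ctime(st.st_mtime),  # last change to the contents
        "ctime": time.ctime(st.st_ctime),  # last metadata change (on Unix)
    }
```

Calling `show_times("Allman Brothers Band/Eat A Peach")` before and after copying a file into that directory should show mtime and ctime moving while the older rip date is lost.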
_________________________
Bitt Faulk
#147064 - 06/03/2003 09:37
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
Looked up "stat" in my perl book (really, I'm going to learn perl, honest!), and we just needed another undef in the argument list for stat. We were getting atime instead of mtime.
Lovely! Awesome!
I need to fix the formatting of the date because I still have that fixed-width issue with the day field. Or, I need to sort the output on date. Those things should be simple enough for me to figure out. I'm off and running.
Thanks again everyone!
Jim
#147065 - 06/03/2003 09:47
Re: UNIX directory list parsing help, please
[Re: TigerJimmy]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
Take a look at the updated script above. It's a little easier to read with respect to the stat output (array reference vs. all those undefs).
Also, if you're going to sort, I'd put all the info in an array (of arrays, probably), and then sort on the time. But make sure you don't localtime() it until you print out. It'll be much easier to sort on the raw timestamp (which is just the number of seconds since the Unix epoch, as usual) than to try to do it on the munged output-style data.
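The same idea, sketched in Python rather than Perl for illustration: collect (timestamp, artist, album) tuples, sort on the raw number, and only convert to a readable date when producing output. The function names, the tab-delimited layout, and the YYYY-MM-DD date format are choices made for this example, and the Artist/Album directory layout is assumed.

```python
import os
import time


def albums_by_mtime(top):
    """Return (mtime, artist, album) tuples for top/Artist/Album, oldest first."""
    rows = []
    with os.scandir(top) as artists:
        for artist in artists:
            if not artist.is_dir():
                continue
            with os.scandir(artist.path) as albums:
                for album in albums:
                    if album.is_dir():
                        rows.append((album.stat().st_mtime,
                                     artist.name, album.name))
    rows.sort()  # tuples sort on the raw timestamp first
    return rows


def format_rows(rows):
    """Convert to tab-delimited text; the date is formatted only here."""
    return ["%s\t%s\t%s" % (artist, album,
                            time.strftime("%Y-%m-%d", time.localtime(mtime)))
            for mtime, artist, album in rows]
```

Because names are read straight from the directory and never pass through a shell, spaces in artist or album names need no special handling.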
_________________________
Bitt Faulk
#147066 - 06/03/2003 10:00
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
Eh, what the hell:
#!/usr/bin/perl
opendir(TOP, ".");
$i = 0;
foreach $dir (readdir(TOP)) {
    next if $dir eq ".";
    next if $dir eq "..";
    if ( -d $dir ) {
        opendir(SUB, $dir);
        foreach $subdir (readdir(SUB)) {
            next if $subdir eq ".";
            next if $subdir eq "..";
            if ( -d "$dir/$subdir" ) {
                $mtime = (stat("$dir/$subdir"))[9];
                $list{"$dir/$subdir"} = $mtime;
            }
        }
        closedir(SUB);
    }
}
closedir(TOP);
format =
@<<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<
$artist, $album, $time
.
foreach $sdir (sort { $list{$a} <=> $list{$b} } keys %list) {
    ($artist, $album) = split(/\//, $sdir);
    $time = localtime($list{$sdir});
    write;
}
It's attached, too.
Attachments
145254-ddir.pl (182 downloads)
_________________________
Bitt Faulk
#147067 - 06/03/2003 12:57
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
old hand
Registered: 15/02/2002
Posts: 1049
|
It worked! I have what I need.
You're the greatest. Thanks a ton, Bitt. Thanks also to all of you who tried to help me get this done from the UNIX prompt. I was able to make some changes to the perl script and get the output format that I wanted.
Anyhow, thanks again,
Jim
#147068 - 06/03/2003 16:48
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
Carpal Tunnel
Registered: 08/02/2002
Posts: 3411
|
I think that is some of the easiest reading perl that I've ever seen. You're obviously not a real(tm) perl coder!
_________________________
Mk2a 60GB Blue. Serial 030102962
sig.mp3: File Format not Valid.
#147069 - 06/03/2003 17:26
Re: UNIX directory list parsing help, please
[Re: genixia]
|
carpal tunnel
Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
|
Even foreach $sdir (sort { $list{$a} <=> $list{$b} } keys %list) ?
And you've gotta wonder what that ``$i = 0'' is in there for. (Oops.)
_________________________
Bitt Faulk
#147070 - 06/03/2003 17:30
Re: UNIX directory list parsing help, please
[Re: wfaulk]
|
carpal tunnel
Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
|
Because every program needs an "i" loop counter, whether it's used or not.