The Power of the Pipeline: Deleting Annoying Files

My dirty little secret, which I guess isn't going to be so secret now, is that I'm a closet Mac user. I love Windows Server as an OS, but I have a tendency to fiddle with Windows client OSs, and that winds up breaking them. By using a Mac, I'm completely disinclined to fiddle, and so I get more work done and have to reinstall my operating system less often (well, "never" thus far). Anyway, one of the only things I dislike about the Mac OS is the way it stores file metadata. Rather than using an alternate "stream" of a file, as NTFS does, Mac OS insists on writing "resource files." So if you have a file named Whatever.txt, there's also a hidden ._Whatever.txt file floating around. The problem is that when you copy Whatever.txt to a USB flash drive to share with someone, they also see the useless-to-them ._Whatever.txt file. When the contents of that USB drive are, say, a bunch of training videos that you sell, it's kind of unprofessional to have all those hidden ._ files on there.

There's actually a neat Mac OS X tool you can buy that will automatically strip the files and/or folders, but I'm not a big fan of "buy" when it comes to software tools. Especially when I have PowerShell! I just wrote myself a one-liner - not a script, mind you (I used to have a multi-line VBScript that did this).

PS H:\> dir -recurse | where { $_.gettype().name -eq 'DirectoryInfo' } | % { $_.getfiles() } | where { $_.name -like '._*' }  | % { del -path $_.fullname -force }

Goodness, I love the pipeline. Here's what's happening:
  • I start by getting a directory - including the -recurse switch, which enumerates subfolders. Notice that I started on the H: drive, which was a USB drive I wanted to clean up. So I'm getting every file, and every folder.
  • Dir will return both folders (directories) and files. At this point I only want the directories, so the next step is to filter out all objects that do not have a type of DirectoryInfo (files would have a type of FileInfo). I'm looking at each object returned by Dir, and executing each one's GetType() method to return its type name. If the type name is equal to DirectoryInfo, it stays; otherwise, it gets dropped.
  • % is an alias to the ForEach-Object cmdlet, and it allows me to execute some operation for each object that remains in the pipeline. So, for each directory - which is all the pipeline will contain at this point - I'm executing the GetFiles() method. That will return a collection of files within that folder. Those get put into the pipeline; after the ForEach-Object cmdlet executes, the pipeline will now contain file objects, and not directory objects.
  • Next, I filter so that I'm only keeping those files whose names start with "._". Those are the hidden files.
  • I pipe those hidden files to ForEach-Object again, executing the Del command (well, that's an alias to Remove-Item), giving it the full name of the file and adding the -force parameter for good measure (some of those files will be marked read-only, and that forces Del to kill them anyway). 
It's a neat trick - one that took some experimenting. Why did I first filter for directories instead of simply filtering for files? By default, Dir won't return hidden files, but a directory's GetFiles() method will do so. So first I get all the directories, then ask them to give me a list of ALL their files - including hidden ones. I then filter on the files' names, getting the ones that match the pattern I'm after. I can change that pattern easily enough to look for other types of files - the example here was for a specific need, but it could easily be altered for others.
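
For anyone who wants to reuse the idea with a different pattern, here is a minimal sketch of the same pipeline written with full cmdlet names and wrapped in a function. The function name Remove-FileByPattern and its -Path and -Pattern parameters are made up for illustration and are not part of the one-liner above; the sketch also swaps the GetType() check for the PSIsContainer property and adds -WhatIf support so the deletions can be previewed first.

```powershell
# Hypothetical helper: the function name and both parameters are illustrative,
# not part of the original one-liner.
function Remove-FileByPattern {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        # Folder to start from, for example the root of a USB drive.
        [Parameter(Mandatory = $true)]
        [string]$Path,

        # Wildcard pattern for the file names to delete.
        [string]$Pattern = '._*'
    )

    Get-ChildItem -Path $Path -Recurse |
        # Keep folders only; PSIsContainer is $true for directories.
        Where-Object { $_.PSIsContainer } |
        # Ask each folder for all of its files, hidden ones included.
        ForEach-Object { $_.GetFiles() } |
        # Keep only the files whose names match the pattern.
        Where-Object { $_.Name -like $Pattern } |
        ForEach-Object {
            # ShouldProcess is what makes -WhatIf preview instead of delete.
            if ($PSCmdlet.ShouldProcess($_.FullName, 'Remove file')) {
                Remove-Item -LiteralPath $_.FullName -Force
            }
        }
}
```

Running Remove-FileByPattern -Path 'H:\' -WhatIf should list what would be deleted without touching anything; dropping -WhatIf performs the deletion. As far as I can tell, this sketch shares one limitation with the one-liner: only the subfolders are asked for their files, so matching files sitting directly in the starting folder are not picked up.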

Discuss this Blog Entry 6

uSlackr (not verified)
on Sep 16, 2010
Funny, I always thought of Mac resource forks and NTFS alternate data streams as different implementations of the same concept. Of course, I'm not aware of anything that uses NTFS streams.
Bewc (not verified)
on Sep 17, 2010
Don,

I feel I have to ask. You are a technical writer and an educator of sorts. Could you please not use aliases/shorthand in your code samples?

It is your choice to use powershell any way you want, but it is different when I have to stop and say "what's that?". It makes it harder to read, harder to understand, etc.

Dir is not the same as get-childitem. Dir is a DOS command, and while dir may be an alias for get-childitem, it misleads newcomers to powershell, because the flags are different between get-childitem's alias dir, and cmd's command dir.

Where is where-object, which is important to remind people that they are dealing with objects not text.

% is an alias of the ForEach-Object. Which I didn't know myself. For those like me... I ran the following command:
Get-Alias %

There's some articles about this... which I'll reference here, and ultimately the choice is yours. It's your article and your code.

http://poshoholic.com/2007/09/06/essential-powershell-avoid-shorthand-in-shared-powershell-scripts/
http://blogs.msdn.com/b/powershell/archive/2006/04/25/583268.aspx

I am asking that you consider not using shorthand in your writing.

Thanks for the "listen".

-Bewc
Aleksandar (not verified)
on Sep 19, 2010
I would rather use { $_.psiscontainer } than { $_.gettype().name -eq 'DirectoryInfo' } to filter out the files.
on Sep 16, 2010
They're definitely the same concept, just different implementations. I find Mac's implementation to be more annoying, especially when transporting files to a Windows computer that sees those "hidden" files.
uSlackr (not verified)
on Sep 16, 2010
Sounds like the Mac converts the resource fork to a second file when it is copied to a foreign file system. Interesting. I'm highly Mac illiterate, but I have one on order.

Nice column.

on Sep 17, 2010
Bewc - funny, I usually say the same thing in classes :). Guess I was kind of in a hurry when writing this one.

What's PowerShell with a Purpose Blog?

Don Jones demystifies Windows PowerShell.
