Data

Data sets:

Other:

Human

File

Path delimiters aren't the same across platforms (/ on *nix, \ on Windows). A trailing slash may or may not mean the same thing, depending on the operation you're doing. Platforms don't all allow or disallow the same characters in directory names or filenames. Quoting and escaping spaces in a path name is done differently on different platforms. Multiple delimiters in a row may mean nothing to the file system (e.g. /usr/bin and /usr////bin may be the same path), but splitting those two on / will yield different results.
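As a rough illustration of the repeated-delimiter point, here is a minimal Python sketch using the standard library's `pathlib`. The variable names are made up for the example, and `PurePosixPath` is used so the result doesn't depend on the machine running it.

```python
from pathlib import PurePosixPath

raw_a = "/usr/bin"
raw_b = "/usr////bin"

# Naive string splitting sees the extra slashes as empty components.
split_a = raw_a.split("/")  # ['', 'usr', 'bin']
split_b = raw_b.split("/")  # ['', 'usr', '', '', '', 'bin']

# A path library collapses the repeated separators while parsing.
parts_a = PurePosixPath(raw_a).parts  # ('/', 'usr', 'bin')
parts_b = PurePosixPath(raw_b).parts  # ('/', 'usr', 'bin')

same_as_strings = split_a == split_b  # False
same_as_paths = PurePosixPath(raw_a) == PurePosixPath(raw_b)  # True
```

The string split treats the two spellings as different things; the path objects compare equal.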

In addition, some special paths like 8.3 short names (C:\>LONGDI~1.COM) include special characters that may not be legal on other platforms. Shortcuts for a home directory like ~ aren't always the same. Etc.

Handling paths with string functions or regexes will break, or will quietly mishandle cases like these.
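To make that concrete, here is a small Python sketch comparing a hand-rolled basename with `pathlib`. `naive_basename` is a hypothetical helper written only for this comparison, and the example paths are invented.

```python
from pathlib import PurePosixPath, PureWindowsPath


def naive_basename(path: str) -> str:
    """Hypothetical string-only approach: take whatever follows the last '/'."""
    return path.split("/")[-1]


# A trailing slash makes the naive version return an empty string.
naive_trailing = naive_basename("/usr/bin/")      # ''
lib_trailing = PurePosixPath("/usr/bin/").name    # 'bin'

# A Windows path with backslashes isn't split at all by the naive version.
win_raw = r"C:\Users\me\notes.txt"
naive_win = naive_basename(win_raw)               # the whole string, unchanged
lib_win = PureWindowsPath(win_raw).name           # 'notes.txt'

# The Windows flavor also accepts forward slashes as separators.
lib_win_fwd = PureWindowsPath("C:/Users/me/notes.txt").name  # 'notes.txt'
```

The pure path classes only parse the text and never touch the file system, so this runs the same on any platform.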

In other words, a path isn't just a string. It's a thing (an object) with semantics and behaviors, best parsed by a library that handles all the edge cases, implements all the POSIX/ISO/RFC crap, and is well tested, so you don't have to worry about it. Just like validating an email address!

Python and PHP have good path libraries; Perl and Ruby too, I think. Many of those also handle URI formats like scheme://user@host:port/path/resource, with a password somewhere in there.
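For the URI case, a minimal Python sketch with the standard library's `urllib.parse.urlsplit` might look like this. The URL, user name, and password are invented placeholders, not taken from any real system.

```python
from urllib.parse import urlsplit

# Hypothetical URL in the scheme://user:password@host:port/path/resource shape.
url = "https://user:secret@example.com:8443/path/resource"

parsed = urlsplit(url)

scheme = parsed.scheme      # 'https'
username = parsed.username  # 'user'
password = parsed.password  # 'secret'
hostname = parsed.hostname  # 'example.com'
port = parsed.port          # 8443 (an int, not a string)
path = parsed.path          # '/path/resource'
```

Pulling the same fields out with a regex means re-deriving the rules for optional credentials, optional ports, and so on, which is exactly the work the library already does.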


The number of people who don't realize that \directory\directory\file is a valid, absolute path on Windows (rooted at the current drive), or what it means, is too damn high.

Also \\?\UNC\ and other fun stuff.
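As a sketch of what a library makes of the \directory\directory\file form, here is how Python's `pathlib` parses it. One caveat worth stating: `pathlib` reports such a path as not absolute because it has a root but no drive, even though Windows resolves it from the root of the current drive. The `D:/work` directory is a made-up stand-in for "wherever you currently are".

```python
from pathlib import PureWindowsPath

rooted = PureWindowsPath(r"\directory\directory\file")

drive = rooted.drive               # '' (no drive letter given)
root = rooted.root                 # '\\' (but it does have a root)
is_absolute = rooted.is_absolute() # False in pathlib's terms: root without drive

# Joining onto a path on some drive keeps that drive but replaces
# everything after it, which mirrors "root of the current drive".
current = PureWindowsPath("D:/work")  # hypothetical current directory
resolved = current / rooted           # D:\directory\directory\file
```

Note that the `work` component is dropped: the rooted path restarts from the top of the drive rather than being appended.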

Source: comments by BundleOfJoysticks on "BREAKING!! NPM package 'ua-parser-js' with more than 7M weekly download is compromised"
