Saturday, February 19, 2022

File Extensions - 19 February 2022

Many moons ago, in the times known as 1997 or so, I was working for Digital Equipment Corporation, commonly known as DEC. Although the use of the name "DEC" was disparaged by much of the corporation in favor of "digital" (with the small 'd'), I was working in a location proudly known as DECwest in Bellevue, WA. I had been working in the UNIX OS storage software group, on the product named variously Ultrix, DEC OSF/1, and eventually DEC UNIX, but it was time for a change, so I was interviewing for a position in the Windows group.

In one of the interviews, I was talking to an engineer (rather senior, as I recall) and he asked a question: how would my program check the file type? As a brief diversion, the file type is simply the type of information contained within a file, be it a directory, a text file, a photo file, an executable file, and so on. Knowing he was in the Windows group, I gave the Windows answer first; viz., check the file extension. In this way, ".txt" is a text file, ".jpg" is a photo file in JPEG format, ".exe" is an executable, ".doc" is a document in Word format, and so on. Before he could go on to the next question, I quickly added that this was totally unreliable. With a trivial name change ('rename'), a file could be marked as any type of file with no regard to its actual contents. This would cause endless user confusion: "I open the '.doc' file and the application crashes all the time," because someone had renamed a '.txt' file to end with '.doc'. Therefore, I counseled, a careful programmer would open the file and look for signature information. This signature inspection might be guided by the file extension, but the extension should be treated only as a hint or a starting point. The interviewer did not like this answer because it pointed out what was, in fact, a major shortcoming in Windows.
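
For the curious, here is roughly what I meant by "look for signature information", as a minimal Python sketch and certainly not what any Windows code of the day looked like. The leading byte signatures for JPEG, DOS/Windows executables, and the OLE compound file used by legacy Word documents are well known; everything else (the names sniff_type, check_file, and EXTENSION_HINTS, the 512-byte read, and the "looks like UTF-8 with no NUL bytes" test for text) is my own assumption for illustration.

```python
import codecs
import os

# Well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),                      # JPEG image
    (b"MZ", "exe"),                                 # DOS/Windows executable
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "doc"),   # OLE compound file (legacy Word)
]

# Hypothetical table: what each extension claims the file is.
EXTENSION_HINTS = {
    ".jpg": "jpeg", ".jpeg": "jpeg",
    ".exe": "exe", ".doc": "doc", ".txt": "text",
}


def sniff_type(header):
    """Guess a file type from its first bytes, ignoring the file name."""
    for magic, kind in SIGNATURES:
        if header.startswith(magic):
            return kind
    # Plain text has no signature, so fall back to a heuristic (an
    # assumption of this sketch): no NUL bytes and valid UTF-8.
    if b"\x00" not in header:
        try:
            # final=False tolerates a multi-byte character cut off
            # at the end of the header.
            codecs.getincrementaldecoder("utf-8")().decode(header, final=False)
            return "text"
        except UnicodeDecodeError:
            pass
    return "unknown"


def check_file(path):
    """Return (type hinted by the extension, type found in the contents)."""
    ext = os.path.splitext(path)[1].lower()
    hinted = EXTENSION_HINTS.get(ext, "unknown")
    with open(path, "rb") as f:
        header = f.read(512)
    return hinted, sniff_type(header)
```

With this sketch, a '.txt' file renamed to end with '.doc' comes back from check_file as ('doc', 'text'), and the mismatch tells the program not to hand the file to a Word parser. The extension is still consulted, but only as the hint.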

I got the job, anyway.


