I despise the fact that we live in a world with different end-of-line file formats. Windows/DOS uses CRLF, Unix uses LF, and Macs used to use CR. Thankfully, Macs adopted the Unix format when OS X was released; if only Windows could do the same.
What I despise even more is that some editors seem incapable of telling a DOS file from a Unix file. There’s nothing worse than finding a once-perfect Unix file corrupted by a small section of lines ending in CRLF while the rest of the file keeps only LFs. Most of the time, the blame can be placed on one’s editor configuration, but I also blame some editor defaults for not at least maintaining the format that the file was opened in. To be fair, most power editors like emacs, vim, TextMate, etc. behave “correctly” by default and keep the format that the file was opened in, but many others (unnamed) do not.
There’s not a whole lot we can do to avoid these problems without hounding our peers, but there are ways to fix these problems after they’re found.
Let’s fix the nastier problem first. When you find a file corrupted with half LFs and half CRLFs, strip out the ^M (CR) characters with a quick search and replace. Run M-% (query-replace) and substitute C-q C-m with nothing. C-q runs quoted-insert and is useful for inserting control characters (e.g. ^M, entered as C-m). Afterwards, hit the exclamation point (!) to tell query-replace to replace all matches with no questions.
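Before reaching for the search and replace, it can help to confirm that a file really is mixed. Here is a minimal sketch in Python of one way to check, outside of Emacs; the function names count_line_endings and has_mixed_line_endings are made up for illustration and are not part of any tool mentioned here.

```python
import re
from collections import Counter

# Try CRLF first so a CR followed by an LF is counted once, not twice.
_EOL = re.compile(rb"\r\n|\r|\n")

# Human-readable names for each end-of-line byte sequence.
_NAMES = {b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}


def count_line_endings(data: bytes) -> Counter:
    """Count each end-of-line style found in raw file bytes."""
    return Counter(_NAMES[m.group()] for m in _EOL.finditer(data))


def has_mixed_line_endings(data: bytes) -> bool:
    """Return True when more than one end-of-line style appears."""
    return len(count_line_endings(data)) > 1
```

Reading the file in binary mode matters, since text mode would translate the line endings before they can be counted: something like has_mixed_line_endings(open("myfile.txt", "rb").read()) should do.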
Other times, you will run into DOS-formatted files and just want to convert them to Unix format for consistency’s sake. To do this, open the buffer and run C-x <RET> f, then enter undecided-unix when prompted for the new coding system. This runs set-buffer-file-coding-system, and the result is very similar to running dos2unix myfile.txt at the command line.
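For a sense of what that conversion amounts to, here is a rough Python sketch of the core step: replace every CRLF with an LF and write the file back. The function name dos_to_unix is hypothetical, and this is only an approximation; the real dos2unix does more than this, and the sketch assumes a plain text file small enough to read into memory.

```python
from pathlib import Path


def dos_to_unix(path: str) -> int:
    """Rewrite the file at `path` with LF line endings.

    Returns the number of CRLF sequences that were converted.
    """
    p = Path(path)
    # Work on bytes so Python does not translate newlines on its own.
    data = p.read_bytes()
    converted = data.count(b"\r\n")
    if converted:
        # Only touch the file when there is something to change.
        p.write_bytes(data.replace(b"\r\n", b"\n"))
    return converted
```

Because it only replaces CRLF pairs, this also cleans up a file with mixed line endings: lines that already end in a bare LF are left alone.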