I despise the fact that we live in a world with different end-of-line file formats. Windows/DOS uses CRLF, Unix uses LF, and Macs used to use CR [1]. Thankfully, Macs started to adopt the Unix format when OS X was released. If only Windows could do the same.
What I despise even more is that some editors seem to be incapable of telling the difference between a DOS and a Unix file. There’s nothing worse than finding a once-perfect Unix file corrupted by a small section of lines with CRLFs while the rest of the file keeps only LFs. Most of the time, the blame can be placed on one’s editor configuration, but I also blame some editor defaults for not at least maintaining the format that the file was opened in. To be fair, most power editors like emacs, vim, TextMate, etc. behave “correctly” by default and keep the format that the file was opened in, but many others (unnamed) do not.
There’s not a whole lot we can do to avoid these problems without hounding our peers, but there are ways to fix these problems after they’re found.
Let’s fix the nastier problem first. When you find a file corrupted with half LFs and half CRLFs, strip out the ^M (CR) characters with a quick search and replace. Run M-% (query-replace) and substitute C-q C-m with nothing. C-q runs quoted-insert and is useful for inserting control characters (e.g. ^M, entered as C-m). Afterwards hit the exclamation point (!) to tell query-replace to replace all matches with no questions.
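For anyone who wants to check a file outside of Emacs first, here is a minimal Python sketch of the same idea: count each kind of line ending to confirm the file really is mixed, then delete every CR byte, which is what the query-replace above does. The function names `count_line_endings` and `strip_carriage_returns` are made up for this illustration and are not part of Emacs or any library.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LFs that are not part of a CRLF
    cr = data.count(b"\r") - crlf   # CRs that are not part of a CRLF
    return {"crlf": crlf, "lf": lf, "cr": cr}


def strip_carriage_returns(data: bytes) -> bytes:
    """Delete every CR (^M) byte, like replacing C-q C-m with nothing."""
    return data.replace(b"\r", b"")


# A small "corrupted" sample: two LF lines and two CRLF lines.
mixed = b"one\ntwo\r\nthree\r\nfour\n"
print(count_line_endings(mixed))    # {'crlf': 2, 'lf': 2, 'cr': 0}

cleaned = strip_carriage_returns(mixed)
print(count_line_endings(cleaned))  # {'crlf': 0, 'lf': 4, 'cr': 0}
```

Note that deleting every CR is only safe when the file has no old-Mac-style lines ending in a bare CR, since those lines would be joined together.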
Other times, you will run into DOS formatted files and will just want to convert them to Unix format for consistency’s sake. To do this, open the buffer and run C-x <RET> f, then enter unix or undecided-unix when prompted for the new coding system. This runs set-buffer-file-coding-system, and after you save the buffer the result is very similar to running dos2unix myfile.txt at the command line.
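If neither Emacs nor dos2unix is at hand, the line-ending part of that conversion can be approximated in a few lines of Python. This is only a rough sketch under the assumption that the file is small enough to read into memory; `convert_to_unix` is a hypothetical helper name, and it does not reproduce the other things Emacs or dos2unix may do (such as handling text encodings).

```python
import os
import tempfile
from pathlib import Path


def convert_to_unix(path) -> bool:
    """Rewrite a file with CRLF replaced by LF. Return True if it changed."""
    p = Path(path)
    original = p.read_bytes()          # bytes, so Python does no newline translation
    converted = original.replace(b"\r\n", b"\n")
    if converted == original:
        return False                   # already Unix format, leave the file alone
    p.write_bytes(converted)
    return True


# Demonstration on a throwaway DOS-formatted file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"first line\r\nsecond line\r\n")

changed_first = convert_to_unix(demo_path)    # converts the file
changed_second = convert_to_unix(demo_path)   # nothing left to convert
result = Path(demo_path).read_bytes()
os.remove(demo_path)

print(changed_first, changed_second, result)
```

Reading and writing in binary mode is the important design choice here: in text mode Python would translate newlines on its own and hide the very characters being fixed.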
[1] CR is Carriage Return. LF is Line Feed (aka Newline).
8 responses so far ↓
Fantastic. I never knew that ‘!’ turns a q-r-r into a replace-regexp. I’ve always just done a C-g, then run a replace-regexp reusing the last replacement.
Have you seen the package: http://centaur.maths.qmul.ac.uk/Emacs/files/eol-conversion.el
C-x <RET> C-f in fact is C-x <RET> f.
Thanks Christoph. I fixed the typo. I’m sorry if this caused anyone else unnecessary confusion.
Just found your blog and it’s great. Only problem: How am I going to remember these tips when I need them?
Anyway: Run M-% (query-replace) and substitute C-q C-m with nothing. ... Afterwards hit the exclamation point (!) to tell query-replace to replace all matches with no questions.
Why not M-x replace-string and then no (!)?
Oh man, I thought this entry was from a few months ago but it’s actually 3 years old. Errr…
David,
Good point, but mostly, it’s because I don’t have replace-string bound to a key binding. M-% is ingrained in my typical workflow.
Tips can easily be converted to a macro and saved; thus reused