There was a time in ancient computer history when a computer had only one CPU. Today, your computer may still have only a single physical CPU, but that CPU almost certainly has multiple cores for data processing. When you run a command, you owe it to the brave sysadmins of the past to put all those cores to good use. One way to honor those who suffered on single-core machines is to use GNU Parallel, the seemingly magical command-line tool that can execute a task on several files simultaneously.
Install Parallel
On CentOS, RHEL, and Fedora, you can install GNU Parallel from your software repository:
$ sudo dnf install parallel
On CentOS and RHEL, you can sometimes find a newer version in EPEL.
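After installing, a quick sanity check can confirm what you have to work with. The sketch below assumes a Linux system with GNU coreutils (for nproc). It also checks that the parallel command on your PATH identifies itself as GNU Parallel, because some distributions package a different tool under the same name. The variable name cores is only illustrative.

```shell
# Count the CPU cores available to this session (nproc is part of GNU coreutils).
cores=$(nproc)
echo "Cores available: ${cores}"

# Confirm that the parallel on your PATH is GNU Parallel
# and not an unrelated tool with the same name.
if parallel --version 2>/dev/null | grep -qi 'gnu parallel'; then
    echo "GNU Parallel is installed"
else
    echo "GNU Parallel was not found"
fi
```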
Launch Parallel for the first time
The first time you use GNU Parallel, it asks you to agree to cite it when you use it in scientific research. Academic tradition requires you to cite works you base your article on. If you use programs that use GNU Parallel to process data for an article in a scientific publication, please cite:
Tange, O. (2022, August 22). GNU Parallel 20220822 ('Rushdie'). Zenodo. https://doi.org/10.5281/zenodo.7015730
This citation helps fund further development, and it won't cost you a cent. If you pay 10,000 EUR, you should feel free to use GNU Parallel without citing. Check the GNU website to find out more about funding GNU Parallel and the citation notice.
To silence this citation notice, run parallel --citation once. Read the notice and follow the instructions to silence the reminder.
Pipe output to Parallel
Assuming you're already familiar with essential Linux commands like find and ls, one of the easiest ways to get started with GNU Parallel is to feed it the results of a command you already understand. For instance, suppose you want to move some log files (ignoring, for the moment, that you may be using logrotate or a similar tool in real life).
$ sudo find /var/log/ -type f -name "*.log" | \
sudo parallel mv {} ~/log-stash
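In this code, the braces ({}) stand in for each path that find returns, so Parallel runs one mv command per file. The destination directory (~/log-stash) must already exist, or mv treats it as a new filename instead of a folder.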
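Before moving real files, it can help to preview what Parallel intends to run. The following is a minimal sketch, not the only way to do it: it uses throwaway directories created with mktemp (the src and dest names and the sample log files are hypothetical) so it needs no sudo and never touches /var/log. It relies on find's -print0 and Parallel's -0 option to handle filenames containing spaces, and on --dry-run to print commands without executing them.

```shell
# Hypothetical scratch directories so the sketch never touches /var/log.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/app.log" "$src/name with spaces.log"

# --dry-run prints each command instead of running it.
# -print0 and -0 separate filenames with NUL characters, so spaces are safe.
find "$src" -type f -name "*.log" -print0 | parallel -0 --dry-run mv {} "$dest"

# Once the printed commands look right, drop --dry-run to move the files.
find "$src" -type f -name "*.log" -print0 | parallel -0 mv {} "$dest"
ls "$dest"
```

Previewing first is a cheap habit: a mistake in a single mv is easy to undo, but the same mistake repeated across hundreds of files is not.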
Learn Parallel syntax
While existing Linux utilities can act as a convenient "front end" for Parallel, you can also just use the parallel command to construct processes. The concept is straightforward, although the logic can sometimes get complex, depending on how many tasks you want to run. Starting simply, here's a basic parallel command:
$ parallel echo {} ::: hello world
hello
world
Notice that the instruction is separated by three colons (:::), with the command on the left and the arguments on the right. If you try that command, you might get hello followed by world, or world followed by hello, depending on which process completes first.
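If the order of the output matters, GNU Parallel can hold results back until they can be printed in argument order. The sketch below is one way to see that, assuming the --keep-order (-k) and --jobs (-j) options behave as documented in the GNU Parallel manual. The sleep command is there only to force the first job to finish last.

```shell
# --keep-order (-k) prints results in the order of the arguments,
# even when a later job finishes first. The first job sleeps longer,
# yet its line is still printed first.
parallel --keep-order 'sleep {}; echo slept {}' ::: 1 0

# --jobs (-j) caps how many jobs run at once; with -j 1 they run one at a time.
parallel --jobs 1 echo {} ::: hello world
```

Keeping order costs nothing in run time, because the jobs still execute in parallel. Only the printing waits.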
Suppose you want to convert a large media file to several formats. Instead of running the encodes one after another, you can use GNU Parallel to launch separate instances of your encoder, each one targeting a different codec (note that ffmpeg needs -i in front of the input file):
$ parallel ffmpeg -i ~/Audio/file.flac ~/Audio/file.{} ::: ogg m4a opus
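The same idea works in the other direction: many input files, one target format. Here is a hedged sketch that uses the {.} replacement string, which GNU Parallel documents as the input with its extension removed. The scratch directory and the one.flac and two.flac files are hypothetical placeholders, and --dry-run is used so that ffmpeg does not need to be installed to try it.

```shell
# Hypothetical sample files in a scratch directory; substitute your own FLAC files.
dir=$(mktemp -d)
touch "$dir/one.flac" "$dir/two.flac"

# {} is the full input path and {.} is the same path without its extension.
# --dry-run prints the ffmpeg commands rather than running them.
parallel --keep-order --dry-run ffmpeg -i {} {.}.ogg ::: "$dir"/*.flac
```

Remove --dry-run to perform the conversions once the printed commands look correct.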
Use multiple variables
Parallel isn't limited to just one {} variable. You can create several inputs and then refer to them by an index number that reflects the order in which they're listed. Consider this output:
$ parallel echo {1} {2} ::: hello Linux ::: world sysadmin
hello world
hello sysadmin
Linux world
Linux sysadmin
In this code sample, {1} indicates the first "block" of input (hello and Linux) while {2} indicates the second "block" (world and sysadmin). They don't have to appear in that order, nor are they limited to a single use:
$ parallel echo {2} {1} {2} ::: hello Linux ::: world sysadmin
world hello world
sysadmin hello sysadmin
world Linux world
sysadmin Linux sysadmin
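By default, Parallel runs every combination of the input blocks, which is why two blocks of two values produce four jobs. When you want values paired by position instead, the GNU Parallel manual describes a --link option. The sketch below contrasts the two behaviors, assuming a version of GNU Parallel recent enough to support --link.

```shell
# Default: every combination of the two input sources (four jobs).
parallel --keep-order echo {1} {2} ::: hello Linux ::: world sysadmin

# --link pairs the inputs by position instead (two jobs):
#   hello world
#   Linux sysadmin
parallel --keep-order --link echo {1} {2} ::: hello Linux ::: world sysadmin
```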
Parallel processing
They say that with great power comes great responsibility, but ideally, with great power also comes great parallelization. The computer in front of you is probably more powerful than what you need most of the time, so you may as well make your everyday commands faster by taking advantage of otherwise wasted cycles. Use GNU Parallel.
About the author
Seth Kenlon is a Linux geek, open source enthusiast, free culture advocate, and tabletop gamer. Between gigs in the film industry and the tech industry (not necessarily exclusive of one another), he likes to design games and hack on code (also not necessarily exclusive of one another).