Running Flash Attention 2 on Windows (torch 2.1 / CUDA 12.1)


I'm trying out Flash Attention 2, which is said to be faster than PyTorch's default attention, on Windows.

https://dajeblog.co.kr/flashattention-v2-%EB%85%BC%EB%AC%B8-%EB%A6%AC%EB%B7%B0-%EA%B8%B0%EC%A1%B4-attention%EB%B3%B4%EB%8B%A4-59%EB%B0%B0-%EB%B9%A0%EB%A5%B8-%EB%8C%80%ED%99%94%EC%B1%97%EB%B4%87-%EB%AA%A8%EB%8D%B8/

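FlashAttention v2 [Paper review]: introducing a conversational (chatbot) model 5 to 9 times faster than conventional attention - NLP AI

One year on, FlashAttention, the new attention algorithm proposed by Stanford University, has completed its evolution. This time there are substantial improvements in the algorithm, parallelization, and work partitioning, and its applicability to large models ...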

Getting started

I used the prebuilt wheels provided here.

https://github.com/bdashore3/flash-attention/releases

Releases · bdashore3/flash-attention

Fast and memory-efficient exact attention.

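The wheel has to match your environment, so I downloaded this one (flash-attn 2.4.1, CUDA 12.1, torch 2.1, CPython 3.11, 64-bit Windows):

flash_attn-2.4.1+cu121torch2.1cxx11abiFALSE-cp311-cp311-win_amd64.whl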
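
Since the file name encodes every version that has to line up, it can help to split it into its parts before installing. The following is a minimal sketch, assuming the release files keep the naming pattern of the wheel above; `parse_flash_attn_wheel` is a hypothetical helper written for this post, not part of flash-attn.

```python
import re
import sys


def parse_flash_attn_wheel(filename):
    """Split a flash-attn wheel file name into its version tags.

    Assumes the naming pattern of the wheel used in this post.
    """
    pattern = (
        r"flash_attn-(?P<version>[\d.]+)"
        r"\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?P<cxx11abi>TRUE|FALSE)"
        r"-(?P<python>cp\d+)-(?P<abi>cp\d+)-(?P<platform>\w+)\.whl"
    )
    match = re.fullmatch(pattern, filename)
    if match is None:
        raise ValueError(f"unexpected wheel name: {filename}")
    return match.groupdict()


wheel = "flash_attn-2.4.1+cu121torch2.1cxx11abiFALSE-cp311-cp311-win_amd64.whl"
tags = parse_flash_attn_wheel(wheel)

# The Python tag of the interpreter running this script, e.g. "cp311" for 3.11.
local_python = f"cp{sys.version_info.major}{sys.version_info.minor}"

print(tags)
print("Python tag matches this interpreter:", tags["python"] == local_python)
```

If the Python tag does not match the interpreter, pip will refuse the wheel, so pick the file whose `cp` tag matches your Python version.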

I also switched CUDA to 12.1.

https://developer.nvidia.com/cuda-12-1-0-download-archive?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local

CUDA Toolkit 12.1 Downloads


Then point the environment variable at the new location:

CUDA_PATH = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
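
To confirm the change took effect, the variable can be read back from Python. This is a small sketch; `cuda_path_matches` is a hypothetical helper, and it only checks that the last folder name is `v12.1`, not that the toolkit is actually installed there. Note that a terminal opened before the change will still see the old value.

```python
import os
from pathlib import PureWindowsPath


def cuda_path_matches(cuda_path, expected_version):
    """Return True if the last folder of cuda_path is 'v<expected_version>'."""
    if not cuda_path:
        return False
    return PureWindowsPath(cuda_path).name.lower() == f"v{expected_version}"


current = os.environ.get("CUDA_PATH")
print("CUDA_PATH =", current)
print("Points at CUDA 12.1:", cuda_path_matches(current, "12.1"))
```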

Then reinstall PyTorch with the matching version:

pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
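
After reinstalling, it is worth checking that the PyTorch build really is 2.1 with CUDA 12.1 before moving on. A minimal sketch follows; `matches_wheel` is a hypothetical helper, and the expected values are simply the ones encoded in the wheel name above.

```python
def matches_wheel(torch_version, cuda_version):
    """Return True if the versions match a cu121 / torch2.1 wheel."""
    return torch_version.startswith("2.1.") and cuda_version == "12.1"


try:
    import torch
except ImportError:
    print("PyTorch is not installed in this environment")
else:
    # torch.version.cuda is None on CPU-only builds.
    cuda_build = torch.version.cuda or ""
    print("torch:", torch.__version__)
    print("CUDA build:", cuda_build)
    print("GPU visible:", torch.cuda.is_available())
    print("Matches the wheel:", matches_wheel(torch.__version__, cuda_build))
```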

Then run the installation:

python -m pip install .\flash_attn-2.4.1+cu121torch2.1cxx11abiFALSE-cp311-cp311-win_amd64.whl
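
A quick smoke test can show whether the installed package actually loads and runs on the GPU. This is a sketch under assumptions: the tensor sizes are arbitrary, and the inputs follow the `(batch, seqlen, nheads, headdim)` layout in fp16 that `flash_attn_func` expects. `flash_attn_available` and `smoke_test` are hypothetical names used only here.

```python
import importlib.util


def flash_attn_available():
    """Return True if the flash_attn package can be found."""
    return importlib.util.find_spec("flash_attn") is not None


def smoke_test():
    """Run one causal attention call on the GPU and return the output shape."""
    import torch
    from flash_attn import flash_attn_func

    # Layout: (batch, seqlen, nheads, headdim); flash-attn needs fp16 or bf16 on CUDA.
    q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
    k = torch.randn_like(q)
    v = torch.randn_like(q)
    out = flash_attn_func(q, k, v, causal=True)
    return tuple(out.shape)


if flash_attn_available():
    print("flash_attn output shape:", smoke_test())
else:
    print("flash_attn is not installed in this environment")
```

An import error at this point usually means the wheel does not match the installed torch or CUDA version.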

Once the installation finishes, here is the result of asking the model from the example below to convert an image into markdown:

https://devmeta.tistory.com/100

Trying out phi-3.5-vision (an AI with eyes?)

I heard Microsoft had released a good model, so I gave it a try. https://www.aipostkorea.com/news/articleView.html?idxno=2220 Microsoft's small language model 'Phi-3' gets eyes: AI looks at charts, graphs, and more and answers. Microsoft (MS) ...

<|user|>
<|image_1|>
Convert the table in the image to markdown format.<|end|>
<|assistant|>

The image contains a table with information about a company named "SK 이너 네이션" which is a subsidiary of SK E&S. Below is the markdown format of the table:
