Why is executor memory usage stuck at 0?


Russ Weeks

I have a very simple Spark job that looks like this:

JavaPairRDD<Key,Value> rawData = newAccumuloRDD(...);
JavaPairRDD<Key,Value> indexSrc =
    rawData.filter(new IndexFilter()).cache();
JavaPairRDD<Key,Value> indexEntries =
    indexSrc.mapPartitionsToPair(new IndexBuilder(numPartitions));
JavaPairRDD<Key,Value> reverseIndexEntries =
    indexSrc.mapPartitionsToPair(new ReverseIndexBuilder(numPartitions));
JavaPairRDD<Key,Value> dataEntries =
    rawData.mapPartitionsToPair(new DataBuilder(numPartitions)).cache();

dataEntries.union(indexEntries)
  .union(reverseIndexEntries)
  .repartitionAndSortWithinPartitions(new PartitionedIndexRDDPartitioner(NUM_BINS))
  .saveAsNewAPIHadoopFile(pidxBulk.toString(), Key.class, Value.class,
      AccumuloFileOutputFormat.class, conf);

Here Key and Value are Apache Accumulo's Key and Value classes (serialized with KryoSerializer).

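I'm not sure exactly where to put the calls to cache(), or whether they are needed at all. But my executors don't seem to be using much of the memory I allocated to them:

[Screenshot showing no memory used]

And the "Storage" page of the application UI is empty.

Am I doing something wrong, or has Spark decided that it can't make this job any faster by storing my RDDs?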

banjara

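"Memory Used" means the memory used for caching.

Your code performs only one action, and indexSrc and dataEntries are not used again after it, so there is no point in caching them.

To prove it, you can add indexSrc.count(); and dataEntries.count(); right after each one is declared, and then check the Executors / Storage page: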

JavaPairRDD<Key,Value> rawData = newAccumuloRDD(...);
JavaPairRDD<Key,Value> indexSrc = rawData.filter(new IndexFilter()).cache();
indexSrc.count();
JavaPairRDD<Key,Value> indexEntries = indexSrc.mapPartitionsToPair(new IndexBuilder(numPartitions));
JavaPairRDD<Key,Value> reverseIndexEntries = indexSrc.mapPartitionsToPair(new ReverseIndexBuilder(numPartitions));
JavaPairRDD<Key,Value> dataEntries = rawData.mapPartitionsToPair(new DataBuilder(numPartitions)).cache();
dataEntries.count();

dataEntries.union(indexEntries)
  .union(reverseIndexEntries)
  .repartitionAndSortWithinPartitions(new PartitionedIndexRDDPartitioner(NUM_BINS))
  .saveAsNewAPIHadoopFile(pidxBulk.toString(), Key.class, Value.class,
      AccumuloFileOutputFormat.class, conf);
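
The reason count() makes a difference is that cache() only marks an RDD for storage. Nothing is kept in memory until an action actually computes it. The following is a toy model of that behavior in plain Java, not Spark code: LazyDataset, LazyCacheDemo and sourceReads are hypothetical names invented for this sketch, and the real scheduler is far more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Toy model only: LazyDataset is a hypothetical class, not part of Spark.
public class LazyCacheDemo {

    // Counts how many times the "source" is actually read.
    static int sourceReads = 0;

    static final class LazyDataset<T> {
        private final Supplier<List<T>> compute;
        private boolean markedForCache = false;
        private List<T> stored = null;

        LazyDataset(Supplier<List<T>> compute) {
            this.compute = compute;
        }

        // Like cache(): only sets a flag, computes and stores nothing yet.
        LazyDataset<T> cache() {
            markedForCache = true;
            return this;
        }

        boolean isStored() {
            return stored != null;
        }

        // Runs the computation, keeping the result only if marked for caching.
        private List<T> evaluate() {
            if (stored != null) {
                return stored;
            }
            List<T> result = compute.get();
            if (markedForCache) {
                stored = result;
            }
            return result;
        }

        // Transformations are lazy: they only describe a new dataset.
        LazyDataset<T> filter(Predicate<T> keep) {
            return new LazyDataset<>(() -> {
                List<T> out = new ArrayList<>();
                for (T item : evaluate()) {
                    if (keep.test(item)) {
                        out.add(item);
                    }
                }
                return out;
            });
        }

        <R> LazyDataset<R> map(Function<T, R> f) {
            return new LazyDataset<>(() -> {
                List<R> out = new ArrayList<>();
                for (T item : evaluate()) {
                    out.add(f.apply(item));
                }
                return out;
            });
        }

        // An action: forces evaluation.
        long count() {
            return evaluate().size();
        }
    }

    public static void main(String[] args) {
        LazyDataset<Integer> rawData = new LazyDataset<>(() -> {
            sourceReads++;
            return List.of(1, 2, 3, 4, 5, 6);
        });
        LazyDataset<Integer> indexSrc = rawData.filter(x -> x % 2 == 0).cache();

        System.out.println("stored before action: " + indexSrc.isStored()); // false
        indexSrc.count();
        System.out.println("stored after count(): " + indexSrc.isStored()); // true

        // Two downstream datasets reuse the stored result instead of re-reading the source.
        indexSrc.map(x -> x * 10).count();
        indexSrc.map(x -> -x).count();
        System.out.println("source reads: " + sourceReads); // 1
    }
}
```

In this model the stored flag plays the role of an entry on the Storage page: it stays empty until the first action runs, which is the effect the extra count() calls above are meant to make visible.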


Last modified: 2021-06-3
