[프로젝트] Synthetic dataset 생성 도구 제작(1)

단일세포 및 공간전사체에 대한 합성데이터셋을 생성하는 프로젝트

Posted Feb 26, 2025 Updated Mar 5, 2025

By Cheong seolmo

1 min read

개요

Synthetic dataset을 생성하는 도구 출판이 목적
Network science 분야와 연결

Nezzle로 시각화

input이 배열(layers, 계층구조)로 들어갈 때 각 종류의 피드백이 네트워크 형성 방식에 따라 결정됨
각 노드는 index(row_idx, col_idx})로 구분
각 피드백은 index(src.iden, tar.iden)으로 구분
각 피드백은 랜덤한 갯수로 추가

code

  
  def update(nav, net):
      layers = [6, 4, 7, 5, 8, 3, 6]
      net, entity = create_network(layers)

      for col_idx in range(len(layers)):
          create_edges(net, entity, col_idx)
        
      feedback_types = ["feedbackloop", "crosstalk"]
      polarity_types = ["positive", "negative"]
        
      num_edges = random.randint(20, 50)

      for _ in range(num_edges):
          f_type = random.choice(feedback_types)

          if f_type == "feedbackloop":
              direction_types = ["utd", "dtu"]
          else:
              direction_types = ["utd", "dtu", "same"]

          d_type = random.choice(direction_types)
          p_type = random.choice(polarity_types)

          add_feedback_edges(net, entity, 1, f_type=f_type, d_type=d_type, p_type=p_type)
            
      nav.append_item(net)

img1
img2

NetworkX(+numpy)로 구현

구조는 네트워크를 통해 결정: 노드와 피드백의 갯수가 네트워크를 따름
NetworkX 패키지의 graph generators 사용
- barabasi_albert_graph
  - 스케일-프리 네트워크 생성
  - 기존 노드들의 차수에 비례하여 새로운 노드가 연결될 확률 증가 ⟶ ‘허브’ 노드 형성
- powerlaw_cluster_graph
  - Power-law 분포를 따르는 노드들이 클러스터링 된 구조를 갖는 네트워크를 형성
  - p: 클러스터링 확률
생성된 피드백의 종류와 갯수 출력
code
1

Research, Framework

project

This post is licensed under CC BY 4.0 by the author.

개요

Nezzle로 시각화

NetworkX(+numpy)로 구현

Trending Tags