chaidnode properties

Last updated: February 11, 2025

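The CHAID node generates decision trees, using chi-square statistics to identify the optimal splits. Unlike the C&R Tree and QUEST nodes, CHAID can generate nonbinary trees, meaning that some splits have more than two branches. Target and input fields can be numeric range (continuous) or categorical. Exhaustive CHAID is a modification of CHAID that examines all possible splits more thoroughly but takes longer to compute.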
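To make the role of the chi-square statistic concrete, the following is a minimal sketch of the two statistics that the `chi_square` property chooses between (Pearson and likelihood ratio), computed for a small contingency table. It is an illustration of the textbook formulas only, not the code that SPSS Modeler runs. The function name `chi_square_stats` and the counts are hypothetical, and the sketch assumes that every row and column contains at least one record.

```python
import math

def chi_square_stats(table):
    """Return (pearson, likelihood_ratio) for a table of observed counts.

    table is a list of rows; each row holds the counts of one predictor
    category across the target classes.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = float(sum(row_totals))

    pearson = 0.0
    likelihood_ratio = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if predictor and target were independent.
            expected = row_totals[i] * col_totals[j] / total
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # an empty cell contributes nothing to the LR sum
                likelihood_ratio += 2.0 * observed * math.log(observed / expected)
    return pearson, likelihood_ratio

# Hypothetical counts: two predictor categories (rows) by two target classes (columns).
observed = [[10, 20], [20, 10]]
pearson, lr = chi_square_stats(observed)
print("Pearson: %.4f, likelihood ratio: %.4f" % (pearson, lr))
```

A larger statistic corresponds to a smaller p-value, and it is the p-value that is compared against significance levels such as `split_alpha` and `merge_alpha`.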

Example

# Get the current stream and look up an existing source node by its ID.
stream = modeler.script.stream()
sourcenode = stream.findByID("id46WRP1285C")

# Create the CHAID node and connect it to the source node.
node = stream.createAt("chaid", "My node", 200, 100)
stream.link(sourcenode, node)

# Fields and model name
node.setPropertyValue("custom_fields", True)
node.setPropertyValue("target", "Drug")
node.setPropertyValue("inputs", ["Age", "Na", "K", "Cholesterol", "BP"])
node.setPropertyValue("use_model_name", True)
node.setPropertyValue("model_name", "CHAID")
# Tree-growing method and output type
node.setPropertyValue("method", "Chaid")
node.setPropertyValue("model_output_type", "InteractiveBuilder")
node.setPropertyValue("use_tree_directives", True)
node.setPropertyValue("tree_directives", "Test")
# Significance levels and chi-square method
node.setPropertyValue("split_alpha", 0.03)
node.setPropertyValue("merge_alpha", 0.04)
node.setPropertyValue("chi_square", "Pearson")
# Stopping rules as absolute record counts
node.setPropertyValue("use_percentage", False)
node.setPropertyValue("min_parent_records_abs", 40)
node.setPropertyValue("min_child_records_abs", 30)
# Convergence settings and category handling
node.setPropertyValue("epsilon", 0.003)
node.setPropertyValue("max_iterations", 75)
node.setPropertyValue("split_merged_categories", True)
node.setPropertyValue("bonferroni_adjustment", True)
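The example above uses absolute record counts for the stopping rules. As a variation, the following sketch switches to percentage-based stopping rules and a custom depth limit, using property names from Table 1. The helper `apply_properties` and the `FakeNode` class are hypothetical: `FakeNode` only stands in for a real node so that the sketch runs outside SPSS Modeler, and the values are illustrative rather than recommended.

```python
class FakeNode(object):
    """Hypothetical stand-in for a Modeler node; it only records property values."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


def apply_properties(node, settings):
    """Set each (name, value) pair on the node, in the order given."""
    for name, value in settings:
        node.setPropertyValue(name, value)


# Percentage-based stopping rules with a custom maximum depth.
# The values are illustrative assumptions, not defaults.
stopping_settings = [
    ("use_max_depth", "Custom"),
    ("max_depth", 5),
    ("use_percentage", True),
    ("min_parent_records_pc", 2),
    ("min_child_records_pc", 1),
]

# Inside SPSS Modeler, pass the node returned by stream.createAt(...) instead.
node = FakeNode()
apply_properties(node, stopping_settings)
```

Keeping the settings in a list makes the order explicit, so that `use_max_depth` is set to `Custom` before `max_depth` is supplied.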

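Table 1. chaidnode properties
`chaidnode` properties	Data type or values	Property description
`target`	field	CHAID models require a single target and one or more input fields. A frequency can also be specified. For more information, see Common modeling node properties.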
`continue_training_existing_model`	flag
`objective`	`Standard` `Boosting` `Bagging` `psm`	`psm` is used for large datasets and requires a server connection.
`model_output_type`	`Single` `InteractiveBuilder`
`use_tree_directives`	flag
`tree_directives`	string
`method`	`Chaid` `ExhaustiveChaid`
`use_max_depth`	`Default` `Custom`
`max_depth`	integer	Maximum tree depth, from 0 to 1000. Used only if `use_max_depth = Custom`.
`use_percentage`	flag
`min_parent_records_pc`	number
`min_child_records_pc`	number
`min_parent_records_abs`	number
`min_child_records_abs`	number
`use_costs`	flag
`costs`	structured	Structured property.
`trails`	number	Number of component models for boosting or bagging.
`set_ensemble_method`	`Voting` `HighestProbability` `HighestMeanProbability`	Default combining rule for categorical targets.
`range_ensemble_method`	`Mean` `Median`	Default combining rule for continuous targets.
`large_boost`	flag	Apply boosting to large datasets.
`split_alpha`	number	Significance level for splitting.
`merge_alpha`	number	Significance level for merging.
`bonferroni_adjustment`	flag	Adjust significance values using the Bonferroni method.
`split_merged_categories`	flag	Allow resplitting of merged categories.
`chi_square`	`Pearson` `LR`	Method used to calculate the chi-square statistic: Pearson or likelihood ratio.
`epsilon`	number	Minimum change in expected cell frequencies.
`max_iterations`	number	Maximum number of iterations for convergence.
`set_random_seed`	integer
`seed`	number
`calculate_variable_importance`	flag
`calculate_raw_propensities`	flag
`calculate_adjusted_propensities`	flag
`adjusted_propensity_partition`	`Test` `Validation`
`maximum_number_of_models`	integer
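`train_pct`	double	The algorithm internally separates records into a model building set and an overfit prevention set. The overfit prevention set is an independent set of data records used to track errors during training, which prevents the method from modeling chance variation in the data. Specify the percentage of records. The default is `30`.
`use_customize_layer`	boolean	The default is `false`. Set this property to `true` if you want to specify particular fields as the points at which to split the decision tree.
`customize_layer`	list	This property is used only if `use_customize_layer` is set to `true`. It is a list of objects, and each object has two attributes. `Layer` is an integer that indicates the specific nth layer of the decision tree to customize. In SPSS Modeler, layers start at `0` (the root). `Fields` is a list of names. Each name is one of the fields on which the decision tree can potentially be split for that `Layer`. SPSS Modeler evaluates these fields in the order listed. When the SPSS Modeler flow runs, the CHAID algorithm evaluates and returns a candidate list of fields to split on, based on the `p` value, for each layer. For a customized layer, each field that you specified for the layer is compared with the full candidate list of fields. The first specified field that matches a field in the candidate list is used for the split. The remaining specified fields are ignored. If no field matches, a warning message is displayed and the tree is split as normal.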
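The matching rule described for `customize_layer` can be sketched in plain Python as follows. This is an illustration of the documented behavior only, not SPSS Modeler code. The function name `choose_split_field`, the field names, and the candidate list are hypothetical. Writing each object as a dictionary with `Layer` and `Fields` keys is an assumption about how the list might be expressed in a script, and using the first candidate as the "normal" split when nothing matches is likewise an assumption.

```python
def choose_split_field(candidate_fields, custom_fields):
    """Return (field, warning) for one customized layer.

    candidate_fields: fields the algorithm returned as split candidates.
    custom_fields: the user-specified Fields for the layer, in listed order.
    """
    for field in custom_fields:
        if field in candidate_fields:
            # First specified field found among the candidates wins;
            # the remaining specified fields are ignored.
            return field, None
    # ASSUMPTION: "split as normal" is modeled here as taking the first candidate.
    fallback = candidate_fields[0] if candidate_fields else None
    return fallback, "No specified field matched; the layer is split as normal."


# Hypothetical customization for two layers.
customize_layer = [
    {"Layer": 0, "Fields": ["BP", "Age"]},
    {"Layer": 1, "Fields": ["Income"]},
]

# Hypothetical candidate list; the same list is reused for both layers for brevity.
candidates = ["Na", "Age", "BP"]

results = [choose_split_field(candidates, entry["Fields"]) for entry in customize_layer]
for entry, (field, warning) in zip(customize_layer, results):
    print("Layer %d: split on %s%s" % (entry["Layer"], field, " (warning)" if warning else ""))
```

In this sketch, layer 0 is split on `BP` because it is the first specified field that also appears in the candidate list, while layer 1 falls back with a warning because `Income` is not a candidate.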
