Datagy Decision Tree Tutorial

Gilg4m3sh
7 min read · Dec 30, 2023

Introduction to Decision Trees: A decision tree is a machine learning model used for classification and regression. Picture a tree in which each node poses a “question” or “decision” about the data and each branch is one of the possible answers. At the end of every branch is a leaf that gives the final classification or prediction. The model is popular because it is simple and easy to interpret: the fitted tree can be drawn and read directly.
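To make the node, branch, and leaf vocabulary concrete, here is a minimal sketch of a decision tree written by hand as nested conditionals. The function name, the features, and every threshold are hypothetical and chosen only for illustration; a real tree learns its questions from data.

```python
def predict_survival(age, fare, is_male):
    """Hand-written decision tree; every threshold here is hypothetical."""
    if is_male:              # root node: first question about the data
        if age < 10:         # inner node on the "male" branch
            return 1         # leaf: predicted class 1 (survived)
        return 0             # leaf: predicted class 0 (did not survive)
    if fare < 5.0:           # inner node on the "not male" branch
        return 0             # leaf
    return 1                 # leaf


print(predict_survival(age=35, fare=13.0, is_male=True))
print(predict_survival(age=18, fare=7.75, is_male=False))
```

Each `if` plays the role of a node, each outcome of the test is a branch, and each `return` is a leaf.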

Installing and Configuring Python and Sklearn: To get started, you need to install Python, which you can download from python.org. Once it is installed, use Python's package manager (pip) to install Sklearn, a machine learning library, by opening the command line and typing pip install scikit-learn. Make sure the numpy and pandas libraries are installed as well, since they are fundamental for handling data. numpy is pulled in as a dependency of scikit-learn, while pandas is installed separately with pip install pandas.
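One way to confirm that the installation worked is to ask Python which versions it can see. The sketch below uses only the standard library; the helper name installed_versions is an assumption made for this example, not part of any package.

```python
from importlib import metadata


def installed_versions(packages=("scikit-learn", "numpy", "pandas")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can be added with pip install followed by its name.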

Python and Sklearn Basics: Python is intuitive and expressive, which makes it ideal for beginners. For example, you create a variable with x = 5. A for loop such as for i in range(5): print(i) prints the numbers 0 through 4. Sklearn, in turn, is used to build machine learning models. For example, you load a dataset with from sklearn import datasets; iris = datasets.load_iris().
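The one-liners above can be run together as a short script. This sketch assumes scikit-learn is installed; the names X and y are just a common convention for features and labels.

```python
from sklearn import datasets

# A variable and a for loop
x = 5
for i in range(x):
    print(i)  # prints 0, 1, 2, 3, 4 on separate lines

# Load the built-in iris dataset and look at what it contains
iris = datasets.load_iris()
X, y = iris.data, iris.target   # feature matrix and class labels
print(X.shape)                  # (150, 4): 150 flowers, 4 measurements each
print(iris.feature_names)
print(list(iris.target_names))
```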

Data Preparation: Use pandas to handle data. For example, import pandas as pd; df = pd.read_csv('tu_archivo.csv') loads a CSV file into a DataFrame. To clean the data, you could use df.dropna() to drop rows with missing values; it returns a new DataFrame, so assign the result back (df = df.dropna()). Categorical variables can be converted to numeric ones with pd.get_dummies(df). Finally, use from sklearn.model_selection import train_test_split; X_train, X_test, y_train, y_test = train_test_split(X, y) to split the data into training and test sets, where X holds the feature columns and y the target column.
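The preparation steps can be tried without any file on disk. The sketch below reads a tiny inline CSV; its rows are adapted from the Titanic sample used later in this article, with one Age deliberately left blank so that dropna has something to remove. The test_size and random_state values are arbitrary choices for the example.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Small illustrative sample; the third row has a missing Age on purpose
csv_text = """Age,Fare,Sex,Survived
35,13,male,0
18,7.75,female,1
,8.05,male,0
45,52.5,female,1
32,30,male,0
22,7.8958,male,0
33,71.2833,female,1
14,30,female,1
"""

df = pd.read_csv(io.StringIO(csv_text))
df = df.dropna()                                     # drop rows with missing values
df = pd.get_dummies(df, columns=["Sex"], dtype=int)  # Sex -> Sex_female, Sex_male

X = df.drop(columns="Survived")  # features
y = df["Survived"]               # target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(list(X.columns))
print(len(X_train), len(X_test))
```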

THE MOST IMPORTANT PART

  1. Building the Model: To build a decision tree in Python with Sklearn, first import the class you need: from sklearn.tree import DecisionTreeClassifier. Create an instance of the model: modelo = DecisionTreeClassifier(). Then train it on your training data: modelo.fit(X_entrenamiento, y_entrenamiento). This step feeds the model your input data (X_entrenamiento) and the corresponding labels (y_entrenamiento).
  2. Evaluating the Model: To evaluate the model's accuracy, use the score method: precision = modelo.score(X_prueba, y_prueba). For a classifier, score returns the fraction of test samples predicted correctly, which gives you an idea of how well the model generalizes to new data. You can also use from sklearn.metrics import confusion_matrix to get a confusion matrix, which breaks down the model's correct and incorrect predictions by class.
  3. Hyperparameter Tuning: Tuning a decision tree's hyperparameters is crucial for improving its performance. One of the most important is the maximum depth of the tree (max_depth). You can experiment with different values to see how they affect accuracy, or use Sklearn's GridSearchCV to automate the search. Example: from sklearn.model_selection import GridSearchCV; parametros = {'max_depth': [3, 5, 10]}; busqueda = GridSearchCV(modelo, parametros); busqueda.fit(X_entrenamiento, y_entrenamiento).
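The three steps above can be sketched end to end on the built-in iris dataset, so the snippet runs without any external file. The choice of dataset, the stratify and random_state arguments, and cv=5 are assumptions made for reproducibility; the variable names follow the ones used in the list.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_entrenamiento, X_prueba, y_entrenamiento, y_prueba = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 1. Build and train the model
modelo = DecisionTreeClassifier(random_state=42)
modelo.fit(X_entrenamiento, y_entrenamiento)

# 2. Evaluate: accuracy on held-out data plus a confusion matrix
precision = modelo.score(X_prueba, y_prueba)
matriz = confusion_matrix(y_prueba, modelo.predict(X_prueba))
print(f"accuracy: {precision:.3f}")
print(matriz)  # rows are true classes, columns are predicted classes

# 3. Tune max_depth with 5-fold cross-validation on the training set
parametros = {"max_depth": [3, 5, 10]}
busqueda = GridSearchCV(DecisionTreeClassifier(random_state=42), parametros, cv=5)
busqueda.fit(X_entrenamiento, y_entrenamiento)
print(busqueda.best_params_, round(busqueda.best_score_, 3))
```

After fitting, busqueda.best_estimator_ holds a tree retrained on the full training set with the best max_depth found, and it can be scored on X_prueba like any other model.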

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
import matplotlib.pyplot as plt
import seaborn as sns

# Load the data
datos = pd.read_csv('titanic.csv')

# Preprocessing: drop rows with missing values in the columns we use,
# then one-hot encode the categorical columns as 0/1 integers
datos = datos.dropna(subset=['Age', 'Fare', 'Sex', 'Embarked'])
datos = pd.get_dummies(datos, columns=['Sex', 'Embarked'], dtype=int)

# Features and target for the model
X = datos[['Age', 'Fare', 'Sex_male']]
y = datos['Survived']

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build and train the model
modelo = DecisionTreeClassifier(random_state=42)
modelo.fit(X_train, y_train)

# Accuracy on the held-out test set
print(modelo.score(X_test, y_test))

# Visualize the decision tree, labeling each split with its feature name
plt.figure(figsize=(20, 10))
plot_tree(modelo, feature_names=list(X.columns), filled=True)
plt.show()

# Visualize the data
sns.pairplot(datos[['Age', 'Fare', 'Sex_male', 'Survived']], hue='Survived')
plt.show()

PassengerId,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,Survived
1,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
2,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
3,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
4,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
5,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
6,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
7,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
8,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
9,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
10,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
11,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
12,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
13,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
14,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
15,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
16,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
17,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
18,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
19,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
20,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
21,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
22,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
23,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
24,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
25,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
26,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
27,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
28,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
29,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
30,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
31,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
32,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
33,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
34,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
35,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
36,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
37,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
38,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
39,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
40,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
41,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
42,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
43,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
44,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
45,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
46,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
47,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
48,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
49,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
50,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
51,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
52,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
53,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
54,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
55,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
56,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
57,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
58,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
59,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
60,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
61,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
62,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
63,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
64,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
65,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
66,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
67,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
68,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
69,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
70,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
71,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
72,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
73,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
74,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
75,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
76,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
77,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
78,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
79,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
80,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
81,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
82,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
83,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
84,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
85,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
86,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
87,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
88,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
89,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
90,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
91,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
92,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
93,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
94,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
95,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
96,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
97,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
98,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
99,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
100,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
101,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
102,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
103,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
104,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
105,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
106,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
107,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
108,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
109,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
110,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
111,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
112,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
113,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
114,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
115,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
116,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
117,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
118,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
119,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
120,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
121,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
122,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
123,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
124,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
125,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
126,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
127,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
128,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
129,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
130,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
131,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
132,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
133,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
134,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
135,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
136,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
137,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
138,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
139,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
140,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
141,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
142,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
143,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
144,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
145,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
146,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
147,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
148,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
149,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
150,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
151,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
152,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
153,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
154,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
155,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
156,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
157,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
158,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
159,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
160,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
161,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
162,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
163,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
164,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
165,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
166,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
167,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
168,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
169,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
170,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
171,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
172,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
173,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
174,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
175,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
176,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
177,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
178,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
179,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
180,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
181,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
182,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
183,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
184,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
185,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
186,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
187,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
188,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
189,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
190,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
191,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
192,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
193,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
194,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
195,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
196,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
197,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
198,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
199,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
200,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
201,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
202,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
203,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
204,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
205,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
206,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
207,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
208,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
209,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
210,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
211,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
212,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
213,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
214,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
215,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
216,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
217,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
218,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
219,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
220,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
221,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
222,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
223,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
224,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
225,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
226,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
227,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
228,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
229,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
230,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
231,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
232,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
233,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
234,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
235,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
236,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
237,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
238,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
239,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
240,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
241,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
242,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
243,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
244,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
245,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
246,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
247,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
248,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
249,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
250,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
251,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
252,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
253,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
254,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
255,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
256,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
257,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
258,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
259,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
260,"Miller, Mr. Michael",male,42,0,0,349246,7.8958,,S,0
261,"Wilson, Miss. Sarah",female,25,0,0,349247,7.75,,S,1
262,"Moore, Mr. Robert",male,36,0,0,349248,7.8958,,S,0
263,"Taylor, Mr. Joseph",male,55,0,0,349249,7.75,,S,0
264,"Smith, Mr. John",male,35,0,0,24160,13,,S,0
265,"Johnson, Miss. Elizabeth",female,18,1,0,347082,7.75,,S,1
266,"Williams, Mr. Charles",male,27,0,0,349215,8.05,,S,0
267,"Brown, Mrs. Margaret",female,45,1,1,113783,52.5,C123,S,1
268,"Davis, Mr. Richard",male,32,0,0,237736,30,,C,0
269,"Miller, Mr. James",male,22,0,0,349234,7.8958,,S,0
270,"Wilson, Mrs. Emily",female,33,0,1,PC 17599,71.2833,C85,C,1
271,"Moore, Miss. Sophia",female,14,1,0,231919,30,,S,1
272,"Taylor, Miss. Anne",female,4,1,1,349909,21.075,,S,1
273,"Anderson, Mr. John",male,48,0,0,113803,53.1,C123,S,1
274,"Martinez, Mr. Carlos",male,29,0,0,345678,7.8958,,S,0
275,"Garcia, Miss. Maria",female,21,0,0,234567,7.75,,S,1
276,"Smith, Mr. William",male,50,0,0,349245,7.8958,,S,0
277,"Johnson, Mrs. Susan",female,32,1,1,234567,45.5,,S,1
278,"Williams, Miss. Laura",female,19,1,0,349237,7.8958,,S,1
279,"Brown, Mr. David",male,28,0,0,237745,30,,C,0
280,"Davis, Mrs. Linda",female,29,1,1,PC 17600,71.2833,C85,C,1
281,"Miller, Mr. Michael",male,42,0,0,349
The illustration shows a fragment of a decision tree. Each node is a point where a split is made on a specific feature, such as age or fare. The “gini” value measures the impurity of the node: a value of 0 means that all the samples in that node belong to a single class. “samples” is the number of samples that satisfy the conditions along the path to that node, and “value” shows how those samples are distributed across the classes. For example, a node with “value = [0, 14]” contains 14 samples of one class and none of the other. Together, these fields show how the model arrives at its predictions from the available data.
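The “gini” number in each node can be reproduced by hand from the node's “value” counts. The sketch below applies the standard Gini impurity formula, one minus the sum of squared class proportions; the helper name gini_impurity is chosen for this example.

```python
def gini_impurity(class_counts):
    """Gini impurity of a node given its per-class sample counts."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


print(gini_impurity([0, 14]))  # 0.0: a pure node, every sample in one class
print(gini_impurity([7, 7]))   # 0.5: an even two-class split, the least pure case
```

For two classes the value ranges from 0 (pure) to 0.5 (evenly mixed), which is why leaves that show gini = 0 are the ones where the tree is fully certain.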
This is the set of pair plots generated with the Seaborn library in Python. The plots show the distributions of, and relationships between, several variables in the Titanic dataset: age (Age), ticket fare (Fare), and encoded sex (Sex_male), with colors distinguishing those who survived (1.0) from those who did not (0.0). The plots on the diagonal show the distribution of a single variable, while the off-diagonal plots show the relationship between two variables. For example, you can see how ages are distributed in relation to fare and survival, or how fare relates to sex and survival. These plots are useful for spotting patterns and possible correlations between passenger characteristics and survival on the Titanic.
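A numeric summary can complement the pair plots. As a rough sketch, the snippet below groups the first nine rows of the sample data listed above by sex and averages each column; with so few rows it only illustrates the technique and says nothing reliable about the full dataset.

```python
import io

import pandas as pd

# First nine rows of the sample data, reduced to the columns of interest
csv_text = """Sex,Age,Fare,Survived
male,35,13,0
female,18,7.75,1
male,27,8.05,0
female,45,52.5,1
male,32,30,0
male,22,7.8958,0
female,33,71.2833,1
female,14,30,1
female,4,21.075,1
"""

datos = pd.read_csv(io.StringIO(csv_text))

# Mean age, mean fare, and survival rate for each sex
resumen = datos.groupby("Sex")[["Age", "Fare", "Survived"]].mean()
print(resumen)
```

The mean of the 0/1 Survived column is the survival rate within each group, which is the same pattern the colors in the pair plots show visually.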

In this tutorial, we explored the basics of decision trees in Sklearn, from data preparation to hyperparameter tuning. With a clear understanding of each step and some practice on examples, you can apply what you have learned to build and tune your own machine learning models.
