       Last update: February 19, 2014
  
						     Delete Records 
                            
GeneXproTools allows you to delete records from both the training and
validation/test datasets. Record deletion is important because any dataset
can contain outliers, which result not only from errors introduced during
data collection but also from intrinsic noise in the data, such as noise
from measuring instruments.
							
  
	                        
	                        
	                        
GeneXproTools helps you detect these outliers with several analysis and
visualization tools. For example, you can spot outliers in any variable
using the adjustable standard deviation lines (1, 2 and 3 sigma) in the
Sequential Distribution Chart. You can also copy the indexes of all the
outliers for the current variable by choosing Copy Outlier IDs (3 Sigma)
in the context menu, then paste them directly into the Delete Records Window
to remove all of them at once.
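
The 3-sigma idea can be sketched outside GeneXproTools as follows. This is only an illustration, not the program's own implementation: the function names `outlier_ids` and `delete_records` are hypothetical, indexes are assumed to be 0-based, and the population standard deviation is assumed (GeneXproTools may compute its sigma lines and number its records differently).

```python
from statistics import mean, pstdev


def outlier_ids(values, n_sigma=3.0):
    """Return the 0-based indexes of values lying more than
    n_sigma standard deviations away from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return []  # all values identical: nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mu) > n_sigma * sigma]


def delete_records(records, ids):
    """Return a copy of records without the rows whose index is in ids."""
    drop = set(ids)
    return [r for i, r in enumerate(records) if i not in drop]


if __name__ == "__main__":
    variable = [10.0] * 20 + [1000.0]  # one obvious outlier at index 20
    ids = outlier_ids(variable)
    cleaned = delete_records(variable, ids)
    print(ids, len(cleaned))
```

Note that the mean and standard deviation are themselves affected by the outliers, so in small datasets a single extreme value may not exceed the 3-sigma threshold.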
							
  
	                        
	                        
							Scatter plots are also useful for detecting outliers and GeneXproTools shows scatter plots 
							for all pairs of variables, including model outputs and derived variables.
							
  
	                        
	                        
The Highlight Records functionality of GeneXproTools is particularly useful
in classification and logistic regression problems, where you can combine it
with different charts, such as the normalized Bivariate Line Chart with
different sorting options, to detect both labeling errors and outliers.
							
  
	                        
	                        
In addition, the Data Panel offers record analyses and statistics that can
help identify outliers or errors. Error analysis, for example, is a powerful
tool for understanding your data and your models. It is also useful for
detecting errors in the data, for instance by comparing misclassified records
with prototypes such as the class centroids in classification and
logistic regression problems.
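
As a rough illustration of comparing misclassified records with class centroids, the sketch below flags records that the model misclassified and that also lie closer to the centroid of the predicted class than to the centroid of their labeled class. This is one possible heuristic under stated assumptions, not the error analysis GeneXproTools performs: the function names `class_centroids` and `suspect_labels` are hypothetical, records are assumed to be numeric tuples, and Euclidean distance is assumed.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def class_centroids(records, labels):
    """Return {label: centroid}, where each centroid is the per-variable
    mean of all records carrying that label."""
    sums, counts = {}, {}
    for x, y in zip(records, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def suspect_labels(records, labels, predictions):
    """Return 0-based indexes of misclassified records that are closer to
    the centroid of the predicted class than to that of the labeled class."""
    cents = class_centroids(records, labels)
    suspects = []
    for i, (x, y, p) in enumerate(zip(records, labels, predictions)):
        if p != y and p in cents and dist(x, cents[p]) < dist(x, cents[y]):
            suspects.append(i)
    return suspects


if __name__ == "__main__":
    records = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (0.5, 0.5)]
    labels = [0, 0, 0, 1, 1, 1]       # the last label looks wrong
    predictions = [0, 0, 0, 1, 1, 0]  # hypothetical model outputs
    print(suspect_labels(records, labels, predictions))
```

Records flagged this way are only candidates for review: they may be labeling errors, genuine outliers, or simply hard cases, so inspect them before deleting anything.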
							
  
	                        
	                        
						    
						    
                                
						    
                            
                             
                             
						                                   
                             
						    
						    
                                        