Overview

Dataset info

Number of variables: 16
Number of observations: 2051
Missing cells: 674 (2.1%)
Duplicate rows: 0 (0.0%)
Total size in memory: 256.5 KiB
Average record size in memory: 128.1 B
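
The overview figures can be cross-checked against the per-variable counts reported in the Variables section. The sketch below is plain arithmetic on numbers copied from this report, not the profiler's own code. It assumes the missing-cell percentage is taken over all cells (observations times variables), that 1 KiB is 1024 bytes, and that the two rejected variables (id_no, year), which report no missing count, have none.

```python
# Cross-check of the overview figures using numbers taken from this report.
n_vars, n_obs = 16, 2051

# Per-variable missing counts from the Variables section; every other
# profiled variable reports 0 missing values.
missing_by_var = {"name": 665, "purpose": 1, "yield_lower": 3, "yield_upper": 5}

missing_cells = sum(missing_by_var.values())
# Assumed denominator: all cells in the table.
missing_pct = 100 * missing_cells / (n_vars * n_obs)

# Assumption: KiB means 1024 bytes.
avg_record_bytes = 256.5 * 1024 / n_obs

print(missing_cells, round(missing_pct, 1), round(avg_record_bytes, 1))
```

Under these assumptions the three values agree with the reported 674 missing cells, 2.1%, and 128.1 B.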

Variable types

Numeric: 8
Categorical: 6
Boolean: 0
Date: 0
URL: 0
Text (Unique): 0
Rejected: 2
Unsupported: 0

Warnings

depth is highly skewed (γ1 = -28.37020322) [Skewed]
depth has 1726 (84.2%) zeros [Zeros]
id_no is highly correlated with date_long (ρ = 0.9999959739) [Rejected]
latitude has 24 (1.2%) zeros [Zeros]
longitude has 24 (1.2%) zeros [Zeros]
magnitude_body has 1217 (59.3%) zeros [Zeros]
magnitude_surface has 1883 (91.8%) zeros [Zeros]
name has a high cardinality: 1309 distinct values [Warning]
name has 665 (32.4%) missing values [Missing]
region has a high cardinality: 79 distinct values [Warning]
year is highly correlated with id_no (ρ = 0.9999959812) [Rejected]
yield_lower has 812 (39.6%) zeros [Zeros]
yield_upper has 36 (1.8%) zeros [Zeros]

Variables

country
Categorical

Distinct count: 7
Unique (%): 0.3%
Missing (%): 0.0%
Missing (n): 0

Value    Count  Frequency (%)
USA      1032   50.3%
USSR     714    34.8%
FRANCE   210    10.2%
CHINA    45     2.2%
UK       45     2.2%
INDIA    3      0.1%
PAKIST   2      0.1%

Max length: 6
Mean length: 3.683081424
Min length: 2
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: False

date_long
Numeric

Distinct count: 1756
Unique (%): 85.6%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 19709735.64
Minimum: 19450716
Maximum: 19980530
Zeros (%): 0.0%

Quantile statistics

Minimum: 19450716
5-th percentile: 19560523.5
Q1: 19621066
Median: 19700501
Q3: 19790920
95-th percentile: 19880826.5
Maximum: 19980530
Range: 529814
Interquartile range: 169854

Descriptive statistics

Standard deviation: 103675.7084
Coef of variation: 0.005260126786
Kurtosis: -0.8312427352
Mean: 19709735.64
MAD: 87936.67656
Skewness: 0.1676987827
Sum: 4.042466779e+10
Variance: 1.07486525e+10
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[19450716. 19450807. 19510127.5 19511124. 19520408. ... 19920924. 19950666. 19951174. 19980512. 19980530. ], "bayesian blocks" binning strategy used)
Value                 Count  Frequency (%)
19830924              6      0.3%
19581022              5      0.2%
19621223              4      0.2%
19841027              4      0.2%
19611104              4      0.2%
19780927              4      0.2%
19581030              4      0.2%
19621224              4      0.2%
19621101              4      0.2%
19821016              4      0.2%
Other values (1746)   2008   97.9%

Minimum 5 values

Value      Count  Frequency (%)
19450716   1      < 0.1%
19450805   1      < 0.1%
19450809   1      < 0.1%
19460630   1      < 0.1%
19460724   1      < 0.1%

Maximum 5 values

Value      Count  Frequency (%)
19980530   1      < 0.1%
19980528   1      < 0.1%
19980513   1      < 0.1%
19980511   1      < 0.1%
19960729   1      < 0.1%
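
date_long appears to hold calendar dates packed as YYYYMMDD integers (an inference from values such as 19450716 sitting next to year 1945 in the sample rows). Because it is profiled as Numeric, figures such as the range and the fractional bin edges are computed on the packed integers and are not durations. A minimal sketch of the difference, using only the reported minimum and maximum; the helper name parse_date_long is hypothetical:

```python
from datetime import date, datetime

def parse_date_long(value: int) -> date:
    """Interpret an integer such as 19450716 as the calendar date 1945-07-16."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_date_long(19450716)    # reported minimum
last = parse_date_long(19980530)     # reported maximum

numeric_range = 19980530 - 19450716  # the "Range" the report computes
elapsed_days = (last - first).days   # actual span between the two dates

print(first, last, numeric_range, elapsed_days)
```

The numeric range reproduces the reported 529814, while the elapsed time between the two dates is far smaller, so date_long is probably better converted to a date type before any time-based analysis.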
 

depth
Numeric

Distinct count: 137
Unique (%): 6.7%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: -0.489632862
Minimum: -400
Maximum: 1.451
Zeros (%): 84.2%

Quantile statistics

Minimum: -400
5-th percentile: -0.001
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.366
Maximum: 1.451
Range: 401.451
Interquartile range: 0

Descriptive statistics

Standard deviation: 10.96769872
Coef of variation: -22.39984194
Kurtosis: 925.8633987
Mean: -0.489632862
MAD: 1.029996113
Skewness: -28.37020322
Sum: -1004.237
Variance: 120.2904152
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[-4.000e+02 -1.750e+00 -5.500e-01 -2.250e-01 -1.750e-01 ... 3.930e-01 6.385e-01 6.425e-01 7.240e-01 1.451e+00], "bayesian blocks" binning strategy used)
Value                Count  Frequency (%)
0                    1726   84.2%
-0.001               54     2.6%
-0.2                 22     1.1%
-0.1                 21     1.0%
-0.5                 15     0.7%
0.64                 11     0.5%
0.2                  7      0.3%
0.3                  7      0.3%
-0.25                6      0.3%
0.32                 6      0.3%
Other values (127)   176    8.6%

Minimum 5 values

Value   Count  Frequency (%)
-400    1      < 0.1%
-160    3      0.1%
-85     1      < 0.1%
-45     1      < 0.1%
-28     1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
1.451   1      < 0.1%
1.311   1      < 0.1%
1.265   1      < 0.1%
1.237   1      < 0.1%
1.219   1      < 0.1%
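
Several of the depth statistics are simple functions of the others, which gives a quick way to check that the fused labels and signs have been read correctly (for example, that the mean is negative). The sketch below uses the textbook definitions with values copied from this report; it is a consistency check, not the profiler's implementation.

```python
import math

# Values copied from the depth statistics above.
n_obs = 2051
minimum, maximum = -400.0, 1.451
q1, q3 = 0.0, 0.0
mean, std, total = -0.489632862, 10.96769872, -1004.237

value_range = maximum - minimum   # reported Range: 401.451
iqr = q3 - q1                     # reported Interquartile range: 0
variance = std ** 2               # reported Variance: 120.2904152
coef_var = std / mean             # reported Coef of variation: -22.39984194
mean_from_sum = total / n_obs     # reported Mean: -0.489632862

print(value_range, iqr, variance, coef_var, mean_from_sum)
```

The values agree with the report to the printed precision. The single value of -400 sits far from the rest of the distribution and likely accounts for much of the large negative skewness and kurtosis.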
 

id_no
Highly correlated

This variable is highly correlated with date_long and should be ignored for analysis

Correlation: 0.9999959739
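
The near-perfect correlation is unsurprising if id_no encodes the test year. In the sample rows, id_no looks like a two-digit year followed by a three-digit sequence number (45001 in 1945, 98005 in 1998). This encoding is inferred from those rows only and is not documented in the report; the helper below is a hypothetical decoder written under that assumption.

```python
def year_from_id_no(id_no: int) -> int:
    """Hypothetical decoder: assumes id_no = two-digit year * 1000 + sequence."""
    return 1900 + id_no // 1000

# (id_no, year) pairs copied from the sample rows of this report.
sample = [(45001, 1945), (46002, 1946), (51001, 1951), (95005, 1995), (98005, 1998)]

decoded = [year_from_id_no(id_no) for id_no, _ in sample]
print(decoded)
```

If the pattern holds across the full dataset, id_no carries little information beyond the date and a per-year counter, which is consistent with the profiler rejecting it.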

latitude
Numeric

Distinct count: 527
Unique (%): 25.7%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 35.39733057
Minimum: -49.5
Maximum: 75.1
Zeros (%): 1.2%

Quantile statistics

Minimum: -49.5
5-th percentile: -22
Q1: 37
Median: 37.1
Q3: 49.87
95-th percentile: 73
Maximum: 75.1
Range: 124.6
Interquartile range: 12.87

Descriptive statistics

Standard deviation: 23.40365418
Coef of variation: 0.6611700319
Kurtosis: 1.457130721
Mean: 35.39733057
MAD: 15.24492637
Skewness: -1.256214377
Sum: 72599.925
Variance: 547.7310292
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[-49.5 -22.37 -22.03 -21.9955 -21.925 ... 73.1 73.2945 73.405 74.35 75.1 ], "bayesian blocks" binning strategy used)
Value                Count  Frequency (%)
37                   448    21.8%
50                   201    9.8%
-22                  103    5.0%
73                   63     3.1%
37.1                 62     3.0%
11.3                 43     2.1%
2                    30     1.5%
37.2                 27     1.3%
0                    24     1.2%
11.35                23     1.1%
Other values (517)   1027   50.1%

Minimum 5 values

Value   Count  Frequency (%)
-49.5   1      < 0.1%
-48.5   1      < 0.1%
-38.5   1      < 0.1%
-30     7      0.3%
-28.7   2      0.1%

Maximum 5 values

Value   Count  Frequency (%)
75.1    1      < 0.1%
74.7    1      < 0.1%
74.6    1      < 0.1%
74.4    1      < 0.1%
74.3    4      0.2%

longitude
Numeric

Distinct count: 575
Unique (%): 28.0%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: -36.04877718
Minimum: -169.32
Maximum: 179.22
Zeros (%): 1.2%

Quantile statistics

Minimum: -169.32
5-th percentile: -139
Q1: -116.054
Median: -116
Q3: 78
95-th percentile: 90.485
Maximum: 179.22
Range: 348.54
Interquartile range: 194.054

Descriptive statistics

Standard deviation: 100.8643401
Coef of variation: -2.797996158
Kurtosis: -1.560713842
Mean: -36.04877718
MAD: 97.09920356
Skewness: 0.40821729
Sum: -73936.042
Variance: 10173.6151
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[-169.32 -156. -139.3 -139.07 -139.0045 ... 132.31 147.25 172.15 179.145 179.22 ], "bayesian blocks" binning strategy used)
Value                Count  Frequency (%)
-116                 468    22.8%
78                   185    9.0%
-139                 108    5.3%
55                   66     3.2%
162.15               43     2.1%
-116.1               41     2.0%
-157                 30     1.5%
0                    24     1.2%
165.2                23     1.1%
-116.03              22     1.1%
Other values (565)   1041   50.8%

Minimum 5 values

Value      Count  Frequency (%)
-169.32    12     0.6%
-157       30     1.5%
-155       3      0.1%
-149.417   1      < 0.1%
-140.5     1      < 0.1%

Maximum 5 values

Value     Count  Frequency (%)
179.22    1      < 0.1%
179.19    1      < 0.1%
179.1     1      < 0.1%
165.2     23     1.1%
163.017   1      < 0.1%

magnitude_body
Numeric

Distinct count: 43
Unique (%): 2.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 2.144904924
Minimum: 0
Maximum: 7.4
Zeros (%): 59.3%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 5.1
95-th percentile: 6
Maximum: 7.4
Range: 7.4
Interquartile range: 5.1

Descriptive statistics

Standard deviation: 2.624897141
Coef of variation: 1.223782514
Kurtosis: -1.689381038
Mean: 2.144904924
MAD: 2.545440559
Skewness: 0.4569896575
Sum: 4399.2
Variance: 6.890085003
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=43)
Histogram with variable size bins (bins=[0. 1.25 3.65 4.25 4.75 5.85 6.15 6.45 7.4 ], "bayesian blocks" binning strategy used)
Value               Count  Frequency (%)
0                   1217   59.3%
5.2                 63     3.1%
5.6                 61     3.0%
5.3                 60     2.9%
5                   53     2.6%
5.4                 50     2.4%
5.5                 44     2.1%
4.9                 43     2.1%
4.8                 41     2.0%
5.7                 38     1.9%
Other values (33)   381    18.6%

Minimum 5 values

Value   Count  Frequency (%)
0       1217   59.3%
2.5     1      < 0.1%
2.6     1      < 0.1%
3.3     1      < 0.1%
3.4     1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
7.4     1      < 0.1%
7.3     1      < 0.1%
7.2     3      0.1%
7.1     7      0.3%
7       1      < 0.1%

magnitude_surface
Numeric

Distinct count: 26
Unique (%): 1.3%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 0.3558264261
Minimum: 0
Maximum: 6
Zeros (%): 91.8%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 4.2
Maximum: 6
Range: 6
Interquartile range: 0

Descriptive statistics

Standard deviation: 1.202229205
Coef of variation: 3.378695668
Kurtosis: 8.162365897
Mean: 0.3558264261
MAD: 0.6533604685
Skewness: 3.148617536
Sum: 729.8
Variance: 1.445355061
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=26)
Histogram with variable size bins (bins=[0. 1.5 3.3 4.05 4.55 5.35 6. ], "bayesian blocks" binning strategy used)
Value               Count  Frequency (%)
0                   1883   91.8%
4.4                 27     1.3%
4.2                 25     1.2%
4.1                 12     0.6%
4.5                 12     0.6%
4.3                 11     0.5%
4.7                 8      0.4%
4                   8      0.4%
3.8                 7      0.3%
5.2                 7      0.3%
Other values (16)   51     2.5%

Minimum 5 values

Value   Count  Frequency (%)
0       1883   91.8%
3       3      0.1%
3.1     2      0.1%
3.2     1      < 0.1%
3.4     4      0.2%

Maximum 5 values

Value   Count  Frequency (%)
6       1      < 0.1%
5.7     2      0.1%
5.5     4      0.2%
5.4     1      < 0.1%
5.3     5      0.2%
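
Both magnitude variables jump from 0 straight to values of 2.5 and above, so the zeros may stand for "not recorded" rather than a measured magnitude of zero. That reading is an assumption, not something the report states. Under it, the mean over recorded values can be recovered from the reported sum and zero count alone, as in this sketch:

```python
n_obs = 2051

# (sum, zero count) copied from the magnitude statistics in this report.
reported = {
    "magnitude_body": (4399.2, 1217),
    "magnitude_surface": (729.8, 1883),
}

results = {}
for name, (total, zeros) in reported.items():
    nonzero = n_obs - zeros
    mean_all = total / n_obs        # reproduces the reported Mean
    mean_nonzero = total / nonzero  # mean if zeros are treated as "not recorded"
    results[name] = (nonzero, mean_all, mean_nonzero)
    print(name, nonzero, round(mean_all, 4), round(mean_nonzero, 2))
```

The all-rows means match the report, while the nonzero-only means land above 4 for both variables, which illustrates how strongly the zero entries pull the reported means, medians, and quartiles toward zero.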
 

name
Categorical

Distinct count: 1309
Unique (%): 63.8%
Missing (%): 32.4%
Missing (n): 665

Value                 Count  Frequency (%)
VEGA                  15     0.7%
LIRA                  6      0.3%
GELIY                 5      0.2%
REGION                5      0.2%
METEORIT              4      0.2%
EASY                  4      0.2%
KRATON                4      0.2%
GORIZONT              4      0.2%
BAKER                 4      0.2%
NEVA                  4      0.2%
Other values (1298)   1331   64.9%
(Missing)             665    32.4%

Max length: 15
Mean length: 5.681131156
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

purpose
Categorical

Distinct count: 28
Unique (%): 1.4%
Missing (%): < 0.1%
Missing (n): 1

Value               Count  Frequency (%)
WR                  1495   72.9%
WE                  181    8.8%
PNE                 153    7.5%
SE                  71     3.5%
FMS                 33     1.6%
PNE:PLO             27     1.3%
SAM                 25     1.2%
WR/SE               11     0.5%
PNE:V               7      0.3%
WR/FMS              6      0.3%
Other values (17)   41     2.0%

Max length: 7
Mean length: 2.274012677
Min length: 2
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: True

region
Categorical

Distinct count: 79
Unique (%): 3.9%
Missing (%): 0.0%
Missing (n): 0

Value               Count  Frequency (%)
NTS                 928    45.2%
SEMI KAZAKH         455    22.2%
MURUROA             172    8.4%
NZ RUSS             128    6.2%
LOP NOR             45     2.2%
ENEWETAK            43     2.1%
CHRISTMAS IS        30     1.5%
BIKINI              23     1.1%
ASTRAK RUSS         15     0.7%
AZGIR KAZAKH        13     0.6%
Other values (69)   199    9.7%

Max length: 12
Mean length: 6.500243784
Min length: 3
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

source
Categorical

Distinct count: 13
Unique (%): 0.6%
Missing (%): 0.0%
Missing (n): 0

Value              Count  Frequency (%)
DOE                712    34.7%
ISC                546    26.6%
UGS                341    16.6%
MTM                169    8.2%
HFS                113    5.5%
WTN                93     4.5%
SPA                23     1.1%
BKY                19     0.9%
DIS                19     0.9%
NOA                10     0.5%
Other values (3)   6      0.3%

Max length: 3
Mean length: 3
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: False

type
Categorical

Distinct count: 20
Unique (%): 1.0%
Missing (%): 0.0%
Missing (n): 0

Value               Count  Frequency (%)
SHAFT               1015   49.5%
TUNNEL              310    15.1%
ATMOSPH             185    9.0%
SHAFT/GR            85     4.1%
AIRDROP             78     3.8%
TOWER               75     3.7%
SURFACE             62     3.0%
BALLOON             62     3.0%
SHAFT/LG            56     2.7%
BARGE               40     2.0%
Other values (10)   83     4.0%

Max length: 8
Mean length: 5.701608971
Min length: 2
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

year
Highly correlated

This variable is highly correlated with id_no and should be ignored for analysis

Correlation: 0.9999959812
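
The reported ρ is a Pearson correlation coefficient. As a rough illustration of why year is rejected, the sketch below computes Pearson's ρ for year against id_no on the ten "First rows" of the sample only, so the result will not equal the full-dataset figure above. The 0.9 rejection cut-off is an assumed value; the threshold actually used is not stated in this report.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# year and id_no copied from the ten "First rows" of the sample.
year = [1945, 1945, 1945, 1946, 1946, 1948, 1948, 1948, 1949, 1951]
id_no = [45001, 45002, 45003, 46001, 46002, 48001, 48002, 48003, 49001, 51001]

REJECT_THRESHOLD = 0.9  # assumed cut-off, not stated in this report

rho = pearson(year, id_no)
print(rho, abs(rho) > REJECT_THRESHOLD)
```

Even on ten rows the coefficient is very close to 1, so keeping both columns would add almost no independent information to a model.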

yield_lower
Numeric

Distinct count: 309
Unique (%): 15.1%
Missing (%): 0.1%
Missing (n): 3
Infinite (%): 0.0%
Infinite (n): 0
Mean: 209.2175316
Minimum: 0
Maximum: 50000
Zeros (%): 39.6%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.001
Q3: 20
95-th percentile: 479
Maximum: 50000
Range: 50000
Interquartile range: 20

Descriptive statistics

Standard deviation: 1641.346929
Coef of variation: 7.845169173
Kurtosis: 466.8625581
Mean: 209.2175316
MAD: 367.9034282
Skewness: 18.5775356
Sum: 428477.5047
Variance: 2694019.742
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    812    39.6%
0.001                259    12.6%
20                   243    11.8%
15                   35     1.7%
8.5                  25     1.2%
10                   22     1.1%
200                  20     1.0%
12                   14     0.7%
1                    14     0.7%
150                  12     0.6%
Other values (298)   592    28.9%

Minimum 5 values

Value    Count  Frequency (%)
0        812    39.6%
0.0002   1      < 0.1%
0.0005   1      < 0.1%
0.0006   1      < 0.1%
0.0007   1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
50000   1      < 0.1%
24200   1      < 0.1%
21100   1      < 0.1%
19100   1      < 0.1%
15000   1      < 0.1%

yield_upper
Numeric

Distinct count: 311
Unique (%): 15.2%
Missing (%): 0.2%
Missing (n): 5
Infinite (%): 0.0%
Infinite (n): 0
Mean: 323.4310209
Minimum: 0
Maximum: 50000
Zeros (%): 1.8%

Quantile statistics

Minimum: 0
5-th percentile: 0.167
Q1: 18.25
Median: 20
Q3: 150
95-th percentile: 1000
Maximum: 50000
Range: 50000
Interquartile range: 131.75

Descriptive statistics

Standard deviation: 2055.203066
Coef of variation: 6.354378318
Kurtosis: 355.1879679
Mean: 323.4310209
MAD: 500.3765148
Skewness: 16.61776731
Sum: 661739.8687
Variance: 4223859.644
Memory size: 16.1 KiB

Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
20                   811    39.5%
150                  273    13.3%
200                  102    5.0%
5                    49     2.4%
0                    36     1.8%
15                   33     1.6%
1000                 31     1.5%
8.5                  25     1.2%
0.001                23     1.1%
10                   22     1.1%
Other values (300)   641    31.3%

Minimum 5 values

Value    Count  Frequency (%)
0        36     1.8%
0.0002   1      < 0.1%
0.0005   1      < 0.1%
0.0006   1      < 0.1%
0.0007   1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
50000   2      0.1%
24200   1      < 0.1%
21100   1      < 0.1%
19100   1      < 0.1%
15000   1      < 0.1%

Correlations

Missing values

Sample

First rows

    country  date_long  depth  id_no  latitude  longitude  magnitude_body  magnitude_surface  name       purpose  region       source  type     year  yield_lower  yield_upper
0   USA      19450716   -0.10  45001  32.54     -105.57    0.0             0.0                TRINITY    WR       ALAMOGORDO   DOE     TOWER    1945  21.0         21.0
1   USA      19450805   -0.60  45002  34.23     132.27     0.0             0.0                LITTLEBOY  COMBAT   HIROSHIMA    DOE     AIRDROP  1945  15.0         15.0
2   USA      19450809   -0.60  45003  32.45     129.52     0.0             0.0                FATMAN     COMBAT   NAGASAKI     DOE     AIRDROP  1945  21.0         21.0
3   USA      19460630   -0.20  46001  11.35     165.20     0.0             0.0                ABLE       WE       BIKINI       DOE     AIRDROP  1946  21.0         21.0
4   USA      19460724   0.03   46002  11.35     165.20     0.0             0.0                BAKER      WE       BIKINI       DOE     UW       1946  21.0         21.0
5   USA      19480414   -0.08  48001  11.30     162.15     0.0             0.0                X-RAY      WR       ENEWETAK     DOE     TOWER    1948  37.0         37.0
6   USA      19480430   -0.08  48002  11.30     162.15     0.0             0.0                YOKE       WR       ENEWETAK     DOE     TOWER    1948  49.0         49.0
7   USA      19480514   -0.08  48003  11.30     162.15     0.0             0.0                ZEBRA      WR       ENEWETAK     DOE     TOWER    1948  18.0         18.0
8   USSR     19490829   0.00   49001  48.00     76.00      0.0             0.0                NaN        WR       SEMI KAZAKH  DOE     SURFACE  1949  22.0         22.0
9   USA      19510127   -0.35  51001  37.00     -116.00    0.0             0.0                ABLE       WR       NTS          DOE     AIRDROP  1951  1.0          1.0

Last rows

      country  date_long  depth  id_no  latitude  longitude  magnitude_body  magnitude_surface  name        purpose  region      source  type  year  yield_lower  yield_upper
2041  FRANCE   19951027   0.0    95005  -21.891   -138.983   0.0             0.0                AEPYTOS     WR       MURUROA     WTN     UG    1995  0.0          60.0
2042  FRANCE   19951121   0.0    95006  -21.879   -139.032   0.0             0.0                PBEGEE      WR       MURUROA     WTN     UG    1995  0.0          40.0
2043  FRANCE   19951227   0.0    95007  -21.881   -138.973   0.0             0.0                THEMISTO    WR       MURUROA     WTN     UG    1995  0.0          30.0
2044  FRANCE   19960127   0.0    96001  -22.236   -138.815   0.0             0.0                XOUTHOS     WR       FANGATAUFA  WTN     UG    1996  0.0          120.0
2045  CHINA    19960608   0.0    96002  41.650    88.760     6.3             0.0                NaN         WR       LOP NOR     HFS     UG    1996  30.0         120.0
2046  CHINA    19960729   0.0    96003  41.690    88.350     5.3             0.0                NaN         WR       LOP NOR     HFS     UG    1996  3.0          12.0
2047  INDIA    19980511   0.0    98001  27.070    71.700     5.3             0.0                SHAKTI 1-3  WR       POKHRAN     HFS     UG    1998  0.0          20.0
2048  INDIA    19980513   0.0    98003  27.070    71.700     0.0             0.0                NaN         WR       POKHRAN     NRD     UG    1998  0.0          1.0
2049  PAKIST   19980528   0.0    98004  28.900    64.890     0.0             0.0                NaN         WR       CHAGAI      HFS     UG    1998  0.0          35.0
2050  PAKIST   19980530   0.0    98005  28.490    63.780     5.0             0.0                NaN         WR       KHARAN      HFS     UG    1998  0.0          18.0