3  Results

3.1 Passenger Load/Trips Analysis

3.1.1 Boxplot: Load Percentage by Direction

Comment:

  • Median Load Percentage: The median load percentage differs noticeably by direction. NB and SB have the two highest medians, while WB and EB have the two lowest, which suggests that northbound and southbound buses carry more passengers while the other two directions have lower utilization.

  • Outliers: All directions show many outliers, which we have not yet accounted for in our calculations. We could explore them further and remove those that may distort our results.

  • Variation: Judging by the interquartile ranges, the SB direction shows the most variation in load percentage, which indicates that the southbound direction should be explored further.
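
The three quantities discussed above (median, interquartile range, and outliers) can be computed directly for each direction. The following is a minimal sketch, not the code used to produce the boxplot: it assumes the common 1.5 × IQR boxplot rule for flagging outliers, the function name `box_stats` is hypothetical, and the numbers are made up for illustration.

```python
from statistics import median, quantiles


def box_stats(values):
    """Return the median, IQR, and 1.5*IQR-rule outliers for one group."""
    # Quartiles via linear interpolation over the observed range.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": median(values),
        "iqr": iqr,
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Made-up load percentages, for illustration only (NOT the study's data).
toy_loads = {
    "NB": [10, 12, 14, 16, 18, 60],
    "EB": [5, 6, 7, 8, 9, 10],
}
stats_by_direction = {d: box_stats(v) for d, v in toy_loads.items()}
```

Comparing the `iqr` values across directions gives a numeric counterpart to the box heights, and the `outliers` lists show which observations would be candidates for closer inspection before any removal. Plotting libraries may compute quartiles slightly differently, so the flagged points may not match a given boxplot exactly.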

3.1.2 Horizontal barchart: Avg. Load Percentage per Borough

Comment:

  • Average Load: Staten Island shows the highest average load percentage. This could indicate a higher utilization of the transportation system, or possibly fewer transportation options leading to more crowded buses or trains.

3.1.3 Horizontal barchart: Avg. Trips with APC per Borough

Comment:

  • Average Trip: Staten Island stands out with a significantly higher average number of trips. This could indicate a higher frequency of trips, or that a smaller number of routes are run more often. We should therefore read this together with the average load to judge whether the trips are poorly utilized or whether more routes are actually needed.
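
Reading the two borough-level averages side by side could look like the sketch below. It assumes each record carries a borough, a load percentage, and a trips-with-APC count. The field names (`borough`, `load_pct`, `trips_with_apc`), the function name, and the placeholder boroughs "A" and "B" are all hypothetical, and the values are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def borough_summary(records):
    """Average load percentage and average trips for each borough."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["borough"]].append(rec)
    return {
        borough: {
            "avg_load_pct": mean(r["load_pct"] for r in rows),
            "avg_trips": mean(r["trips_with_apc"] for r in rows),
        }
        for borough, rows in groups.items()
    }


# Invented records for two placeholder boroughs (NOT the study's data).
toy_records = [
    {"borough": "A", "load_pct": 20.0, "trips_with_apc": 100.0},
    {"borough": "A", "load_pct": 30.0, "trips_with_apc": 300.0},
    {"borough": "B", "load_pct": 10.0, "trips_with_apc": 50.0},
]
summary = borough_summary(toy_records)
```

A borough with many trips but a low average load would point toward under-used service, while many trips combined with a high load would point toward demand that the existing routes struggle to absorb.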

3.1.4 Cleveland plot: Avg. Load Percentage per Route

Comment:

  • Route Load Variation: There is a wide variation in the average load percentages across different routes. Some routes have a low average load, while others are approaching 50%, which indicates differences in route popularity, capacity, or possibly the time of day the data was collected.

  • Comparison Across Boroughs: It appears that Staten Island has several routes with higher load percentages, while routes in Queens have a lower average load. This could reflect differences in transportation needs or service levels across boroughs.

3.3 Multiple variable Interactions

3.3.1 Barcharts by Borough: Avg. Load Percentage with respect to Hour (Each Borough)

Comment:

  • Variability: Each borough exhibits variability in load percentages throughout the day. This could be reflective of different travel patterns, with peaks likely corresponding to rush hours.

  • Peak Times: The timing and sharpness of the peaks differ by borough:

  • Bronx: Shows pronounced peaks, which could mean that rush hours are more marked in the Bronx, with higher bus usage during these times.

  • Brooklyn: Exhibits a more even distribution with less pronounced peaks, possibly indicating steadier bus usage throughout the day.

  • Queens: Similar to the Bronx, there are distinct peaks that may correspond to rush hours.

  • Staten Island: Has a more varied pattern with multiple peaks, suggesting several busy periods throughout the day.
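
The hourly profiles behind these bar charts amount to averaging load percentage within each (borough, hour) pair and then looking for the highest bars. The sketch below shows one way to do that. The function names, the placeholder borough "A", and the input tuples are hypothetical and invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def hourly_profile(records):
    """Average load percentage for each hour, keyed by borough."""
    buckets = defaultdict(list)
    for borough, hour, load_pct in records:
        buckets[(borough, hour)].append(load_pct)
    profile = defaultdict(dict)
    for (borough, hour), loads in buckets.items():
        profile[borough][hour] = mean(loads)
    return dict(profile)


def peak_hour(hour_to_load):
    """Hour with the highest average load percentage."""
    return max(hour_to_load, key=hour_to_load.get)


# Invented (borough, hour, load_pct) observations (NOT the study's data).
toy_obs = [
    ("A", 8, 40.0),
    ("A", 8, 50.0),
    ("A", 13, 20.0),
    ("A", 17, 35.0),
]
profile = hourly_profile(toy_obs)
busiest = peak_hour(profile["A"])
```

A borough with several busy periods, as described for Staten Island, would show up as multiple hours with averages close to the maximum rather than a single clear peak.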

3.3.2 Scatterplot: Correlation between Load Percentage and Trips

Comment:

  • Data Distribution: Most data points cluster at the lower end of the load percentage axis, which suggests that most bus trips have a low to moderate load percentage. However, no clear correlation between load percentage and trips is apparent.

  • Outliers: At higher load percentages, the points (representing the Staten Island borough) are scattered more widely along the y-axis. These could be considered outliers, indicating occasional trips with significantly higher load percentages.
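
A visual impression of weak correlation can be checked numerically with the Pearson correlation coefficient. The sketch below implements the textbook formula and is offered as an illustration only: the report does not state that a coefficient was computed, the function name is hypothetical, and the inputs are toy numbers.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy inputs (NOT the study's data): a perfectly linear pair and an
# unrelated pair.
r_linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_none = pearson_r([1, 2, 3, 4], [1, -1, -1, 1])
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate none. Because the coefficient is sensitive to extreme points, it would be worth computing it both with and without the widely scattered high-load points.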

3.3.3 Stacked bar chart: Direction composition of each borough

Comment:

  • Graph distribution: The graph shows that only Brooklyn has routes in all four directions, while the Bronx and Staten Island have only northbound and southbound routes and Queens has only eastbound and westbound routes. The earlier graphs indicate that the north and south directions have higher load percentages. This is consistent with the Bronx and Staten Island having higher load percentages and Queens having a lower one.
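
The composition shown in a stacked bar chart is the share of each direction within a borough. A minimal sketch of that calculation follows. The function name, the placeholder boroughs "A" and "B", and the rows are hypothetical and invented for illustration.

```python
from collections import Counter, defaultdict


def direction_shares(rows):
    """Fraction of records in each direction, per borough."""
    counts = defaultdict(Counter)
    for borough, direction in rows:
        counts[borough][direction] += 1
    return {
        borough: {d: c / sum(cnt.values()) for d, c in cnt.items()}
        for borough, cnt in counts.items()
    }


# Invented (borough, direction) rows (NOT the study's data).
toy_rows = [
    ("A", "NB"), ("A", "SB"),
    ("B", "EB"), ("B", "WB"), ("B", "EB"), ("B", "EB"),
]
shares = direction_shares(toy_rows)
```

A direction that is absent from a borough's result has no records at all there, which is how a borough with only two directions would appear.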

3.4 Non-technical summary:

  • Busyness by Time and Place: Some graphs showed how busy the buses are at different times of the day in various boroughs. For example, there are times in the morning and evening when the buses are particularly full, likely when people are going to or from work.

  • Direction Matters: Other charts indicated that in some boroughs, buses going in certain directions are busier than others. This might tell us where more people work or live, or maybe where popular destinations are.

  • Weekdays vs. Weekends: There’s also a difference between weekdays and weekends. Buses are consistently busier on weekdays, probably due to people’s work schedules, while weekends show more varied bus usage, reflecting more leisure or irregular travel.

  • Different Boroughs, Different Patterns: Staten Island is the borough with the highest load percentage and the most Trips with APC. The Bronx also has a high load percentage but fewer Trips with APC, which suggests that its service may not be fully utilized. Queens has the lowest load percentage, which suggests that we may need to allocate fewer resources to Queens and more to Staten Island.

  • Relationship: There is no clear relationship between load percentage and Trips with APC, so both measures should be considered together when addressing route optimization and further policy and planning.
