Abstract

The ability to pinpoint fishing activity in the world’s oceans has greatly improved over the past decades, a period in which both satellite-based vessel monitoring systems (VMS) and automatic identification systems (AIS) were introduced for fisheries control and maritime safety purposes, respectively. These data have been used extensively for fisheries research and have brought new insights into the spatial and temporal activities of many different fishing fleets. More recently, data products from Global Fishing Watch (GFW), derived from AIS data analyses, have boosted research, because GFW reports identified fishing events globally at high spatial and temporal resolution. However, working with pre-processed data comes with a risk because data scientists who rely on GFW data products cannot change the underlying assumptions used by GFW to define fishing events. In this study, we compare the fishing events identified by GFW with fishing events defined from self-sampling programmes on board two large pelagic fleets in the Northeast Atlantic. Within these self-sampling programmes, the exact position and time of hauls are meticulously reported, allowing for a comparison in both the number of hauls identified and the haul duration. Results reveal that the assumptions made by GFW to define fishing events lead to an overestimated duration of gear deployment within a range of 30%–380%, depending on the target species and vessel type. In addition, by comparing the self-sampling data with unprocessed VMS data, we demonstrate that it is likely that the activity in which vessels search for fish using sonar and echosounder equipment is mistaken for gear deployment. We recommend that authorities and GFW allow scientists free access to the unprocessed AIS data or that organizations such as GFW work more closely with the fishing sector and scientific community to improve their data products.

Introduction

The ability to present fishing effort at global scales has improved greatly over the past decades with the introduction of satellite-based vessel monitoring systems (VMS) (Murawski et al. 2005, Mills et al. 2007, Stelzenmüller et al. 2008) and later automatic identification systems (AIS) (IMO 2015, Natale et al. 2015, Shepperson et al. 2018). In both systems, fishing vessels send a signal to a ground station or satellite at a regular interval, which includes information on their position, direction of travel, speed, and time/date stamp. VMS was introduced for control purposes, with a requirement for a signal to be sent every 15 min–2 h, depending on the control agency. AIS was introduced for safety reasons, with a signal interval of a few seconds when vessels engage in activity at sea. Both sources have been used abundantly in fisheries research for those fleet segments that are legally required to carry VMS and AIS transponders (EC 1993, IMO 2015). The nature of the data is usually considered sensitive, as it may reflect specific knowledge on suitable fishing grounds, and as a consequence, it is protected by privacy laws in several places around the globe (Hintzen et al. 2012, Shepperson et al. 2018). For AIS data, however, several data providers collect, store, and distribute the data on global scales. MarineTraffic.com is one of the most well-known examples of an AIS-based dashboard where all fishing vessels equipped with AIS are visible on a map or where historical AIS data can be purchased. Another supplier of AIS-based products is Global Fishing Watch (GFW) (Kroodsma et al. 2018), which maintains a worldwide database of AIS data and produces publicly available data products for fisheries monitoring and research derived from the AIS data.

Although VMS and AIS data can be visualized without considering quality checks on the raw data, it is often more informative to analyse the data first (Vermard et al. 2010, Russo et al. 2011, Hintzen et al. 2012), and for example, link it to information on the fishing gear used (Eigaard et al. 2016). This allows for the separation of VMS and AIS pings associated with steaming to and from fishing grounds from those associated with the actual gear deployment (Mills et al. 2007, Joo et al. 2015) (the term ‘gear deployment’ is used to indicate that a fishing net is in the water column, while ‘fishing activity’ refers to all activities within a fishing trip (such as steaming, floating, searching, and fishing), and ‘fishing effort’ indicates the sum of gear deployment duration). Spatial distribution maps that focus solely on fishing effort can be used to assess the impact of bottom trawling (Eigaard et al. 2017, Amoroso et al. 2018, Kroodsma et al. 2018) or to study the distribution and displacement of fishing owing to the development of nature protection (Murawski et al. 2005) or offshore wind energy (Bastardie et al. 2015).

Defining when vessels deploy their gear is, however, not always a straightforward and simple exercise. Although demersal trawlers generally have a clearly defined bandwidth of fishing operation speeds (Rijnsdorp et al. 1998), for line, pots, gillnet, or purse seine fisheries, the vessel speed and changes in direction (Mills et al. 2007, Joo et al. 2015) are less informative because vessels tend to fish, haul, or float at low speeds making differentiating activity complex. For these fisheries, alternative methods need to be investigated, such as using machine learning (ML) methods that consider speed, change in heading, time of day, seafloor depth, and even environmental conditions as input to a fishing activity classification model (Galparsoro et al. 2024, Sales Henriques et al. 2024). Accurate detection of when vessels deploy their gear is key to drawing conclusions related to the impact of fishing, such as the amount of effort/fuel used to catch fish, the associated catch rates (i.e. catch per unit effort, CPUE (Gulland 1964)), or the potential for interaction with sensitive species in the area.

For pelagic trawling, fishing takes place within a certain vessel speed bandwidth. The interpretation of speed histograms to separate activity phases associated directly with fishing, including searching, trawling, and then pumping of fish on board, can be difficult because searching behaviour, making use of sonar and echosounders to locate schools of fish, may be executed at speeds comparable to steaming or fishing speeds. Vessels may also travel slowly (i.e. at fishing speeds) on their way to factories if there is a queue, or slowly to port if the tide restricts entry time. Searching speeds may also depend on how far ahead fish schools are detected by sonar, with some sonars now having a 10 km detection range. Hence, identifying fishing activity phases based on speed may require additional knowledge about the fishery in question. It is therefore important to evaluate the accuracy of AIS data products, such as those produced by GFW (referred to as GFW data from here onwards), because they are routinely being used in scientific research worldwide; for example, to display fishing effort (Natale et al. 2015, Zhang et al. 2022), study illegal fishing activity (Mullié 2019), or study fishing activity in areas designated for offshore wind energy (Stelzenmüller et al. 2022, Virtanen et al. 2022) and Marine Protected Areas (MPAs) (White et al. 2020, Sala et al. 2021, McDonald et al. 2024, Victorero et al. 2025). Furthermore, AIS data are being used to derive CPUE (Niu et al. 2024), an indicator often used in stock assessments to direct sustainable fisheries management, to infer habitat use or habitat suitability (Yang et al. 2024), and to estimate sensitive species bycatch (Shea et al. 2023, Bell et al. 2025). Thus, there is a need to validate the AIS data products used to ensure that interpretation by users is straightforward and accurate. Extensive cross-validation with independently collected information on fishing activity could serve this purpose, as this study illustrates.

In Europe, several pelagic fishing organizations have adopted self-sampling programmes in which the position and the date/time of fishing events are meticulously recorded, which means that the exact fishing effort and fishing location can be determined. In this study we use such data collected by the Pelagic Freezer Trawler Association (PFA), an organization representing 17 freezer trawlers, and the Scottish Pelagic Fishermen’s Association (SPFA), an organization that represents 22 refrigerated seawater vessels (RSW). The self-sampling data are used to cross-check the GFW ‘fishing events’ data product and identify structural differences in the spatial location, number, and duration of fishing events. We discuss these differences, which are likely caused by inaccurate assignment of fishing activity based on predominantly vessel-speed-based rules, highlight the impact this may have on existing and future products from scientists using this data source for analyses, and recommend closer collaboration with scientific experts such as gear technologists and industry partners to improve these products.

Material and methods

Study area

The fishing vessels studied here are active in the Northeast Atlantic, specifically targeting mackerel, horse mackerel, blue whiting, and herring. Annually, PFA vessels land around 400 000 t of small pelagic fish and spend a total of 750 days fishing at sea, while SPFA vessels land around 275 000 t of small pelagic fish and spend a total of 430 days fishing at sea. Catches of these small pelagic fish are bounded by Total Allowable Catches (TACs), set annually by the relevant authorities taking into consideration scientific advice provided by the International Council for the Exploration of the Sea (ICES). The fishery is often considered as one with a low environmental footprint given its low CO2 emission per kg landed fish (Parker et al. 2018), low bottom impact (ICES 2021), and limited bycatch of Endangered, Threatened and Protected Species (ICES 2024). Figure 1 shows the spatial distribution and summed duration of gear deployment as obtained from the self-sampling programmes.

Figure 1.

Overview of spatial distribution of the PFA and SPFA fishery as used in this study. Colours indicate the summed duration of gear deployment over the 5-year study period.

PFA self-sampling programme

The PFA is an association with nine member companies that together operate 17 (in 2023) freezer trawlers in six European countries (www.pelagicfish.eu). In 2015, the PFA initiated a self-sampling programme recording the species compositions by haul and regularly taking length measurements from the catch. The programme has been incrementally implemented in the fishery, with all vessels in the PFA fleet participating since 2018. All vessels used a dedicated software package, mCatch, for haul registration, where a haul refers to the activity of gear deployment.

In this study, PFA trip and haul registration data, consisting of date and time of both shoot and haul position, were used for the period 2018–2022 for a total of 15 freezer trawlers (two vessels fish as a pair-trawl and may complicate comparisons to other datasets). Only trips with near to complete haul recording [allowing up to two hauls to be missing out of an average of 27 hauls per trip, owing to failure to record all hauls (3%)] have been used for further analyses. Haul information was matched with species composition information, which was documented by haul and verified by the production administration onboard, prior to the fish being processed into their final product. Species composition data were used to assign the target species of the fishing trip, where the most abundant species caught during a trip was considered the target species. In this analysis, we used trips with herring, horse mackerel, mackerel, and blue whiting as target species in the Northeast Atlantic.
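The target-species rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (trip identifier, species, catch weight) and the function name `assign_target_species` are hypothetical, and "most abundant" is taken here to mean the largest summed catch weight over the trip.

```python
from collections import defaultdict


def assign_target_species(haul_catches):
    """Return {trip_id: target species} from (trip_id, species, catch_kg) records.

    Assumption: the target species is the one with the largest summed catch
    weight over all hauls of a trip. Ties go to the species seen first.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for trip_id, species, catch_kg in haul_catches:
        totals[trip_id][species] += catch_kg
    return {
        trip_id: max(by_species, key=by_species.get)
        for trip_id, by_species in totals.items()
    }


# Hypothetical example records, not taken from the self-sampling data.
example = [
    ("T1", "herring", 120.0),
    ("T1", "mackerel", 30.0),
    ("T1", "mackerel", 60.0),
    ("T2", "blue whiting", 500.0),
]
print(assign_target_species(example))
```

In this made-up example, trip T1 is assigned to herring because its summed herring catch (120) exceeds the summed mackerel catch (90).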

SPFA self-sampling programme

The SPFA (www.scottishpelagic.co.uk) represents its 22 member vessels in the UK. In 2018, the SPFA initiated what is now known as the Scottish Pelagic Industry-Science Data Collection Programme. The self-sampling programme required vessel crews to sample fish from every haul of every trip. Fish length and weight data were collected as the fish were pumped onboard, and information on haul location, date/time, and duration was recorded to connect the biological sample data to the location and date/time of the catch. Details of the sampling programme and how it developed are described fully in Mackinson et al. (2023) and references therein.

In this study, SPFA trip and haul data from 20 vessels pertaining to the period 2018–2022 were used (2 vessels do not participate in the self-sampling programme). Trips were assigned to fisheries according to the species targeted and season, resulting in three fisheries: herring, mackerel (autumn and winter), and blue whiting.

GFW dataset

Data from GFW can be obtained through their application programming interface and the use of the ‘gfwr’ R package (Global Fishing Watch 2022). GFW offers two datasets of fishing data to be extracted from their servers. The first one is an interpretation of individual vessel’s raw AIS data, which gets assigned a fleet segment such as fishing vessels or cargo vessels and thereafter aggregated into segments of fishing events (i.e. gear deployment) based on analyses of consecutive AIS pings of the vessel in question. Through the ‘get_event’ function in the GFW R package (Clavelle et al. 2023), this data can be accessed, similar to approaches taken by White et al. (2020), Niu et al. (2024), and Victorero et al. (2025). A second dataset, which is often used in research, is the apparent fishing effort (e.g. Shea et al. 2023, Yang et al. 2024, Bell et al. 2025), in which the above-described fishing events are aggregated at a pre-defined raster, and the sum of all fishing effort is shown within each raster cell.

Fishing event data were obtained within the date/time window corresponding to the self-sampling data. Each vessel-fishing trip combination query returned information from the GFW database on the start and end positions and date/time stamps of all the fishing events corresponding to a specified window. Other information returned by the query, such as bounding box information or calculated distances, was not required for further analyses and was ignored. In the fleets studied here, it is illegal to turn off the AIS system, and we therefore assume the GFW data to be complete, although outages are known to occur (Kroodsma et al. 2018, Welch et al. 2022).

Matching self-sampling and GFW data

Fishing events from GFW and haul data from self-sampling programmes were matched for each vessel using the Maritime Mobile Service Identity number (MMSI) as the linking key. The MMSI is a unique number for each vessel. GFW fishing events that fell within the range of the first and last haul of the self-sampling data were considered to cover the same fishing trip as listed in the self-sampling data. Matched GFW data were assigned the same trip numbers and target species as were available in the self-sampling data.
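A minimal sketch of this matching step is given below. The dictionary keys and the function name `match_events_to_trips` are hypothetical, and the sketch assumes that a GFW event belongs to a trip only when it lies entirely between the first shoot time and the last haul time of that trip; whether partially overlapping events were also accepted is not stated here.

```python
from datetime import datetime


def match_events_to_trips(gfw_events, ss_hauls):
    """Attach trip_id and target species to GFW events, linked by MMSI.

    gfw_events: dicts with 'mmsi', 'start', 'end' (datetime).
    ss_hauls:   dicts with 'mmsi', 'trip_id', 'target', 'shoot', 'haul'.
    Assumption: an event must fall fully inside the trip window.
    """
    # Trip window per (mmsi, trip_id): first shoot time to last haul time.
    windows = {}
    for h in ss_hauls:
        key = (h["mmsi"], h["trip_id"])
        if key not in windows:
            windows[key] = {"start": h["shoot"], "end": h["haul"],
                            "target": h["target"]}
        else:
            w = windows[key]
            w["start"] = min(w["start"], h["shoot"])
            w["end"] = max(w["end"], h["haul"])

    matched = []
    for ev in gfw_events:
        for (mmsi, trip_id), w in windows.items():
            same_vessel = ev["mmsi"] == mmsi
            inside = w["start"] <= ev["start"] and ev["end"] <= w["end"]
            if same_vessel and inside:
                matched.append({**ev, "trip_id": trip_id,
                                "target": w["target"]})
                break  # each event is assigned to at most one trip
    return matched
```

Events from vessels without self-sampling data, or outside any trip window, are simply dropped by this sketch.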

Indicators

Four indicators were calculated to illustrate the matches and mismatches between the GFW and self-sampling fishing event registration. All indicators were calculated per target species fishery and per vessel.

  1. The number of hauls within a trip.

  2. The average haul duration within a trip (in hours).

  3. Total hours of fishing effort for a trip.

  4. The centre-point of a fishing trip, taken as the average of the midpoints of all hauls within a trip.
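The four indicators listed above could be computed per trip along the following lines. The field names and the function `trip_indicators` are hypothetical, and the centre-point uses a simple arithmetic mean of longitude and latitude, which is an assumption that is only reasonable for trips that do not span the antimeridian.

```python
from datetime import datetime


def trip_indicators(hauls):
    """Compute the four trip-level indicators from a list of hauls.

    Each haul is a dict with 'start' and 'end' (datetime) and
    'start_pos' and 'end_pos' as (lon, lat) tuples in decimal degrees.
    """
    if not hauls:
        raise ValueError("a trip needs at least one haul")
    n = len(hauls)
    # Haul durations in hours.
    durations = [(h["end"] - h["start"]).total_seconds() / 3600.0
                 for h in hauls]
    total_effort = sum(durations)
    # Midpoint of each haul, then the average of those midpoints.
    mids = [((h["start_pos"][0] + h["end_pos"][0]) / 2.0,
             (h["start_pos"][1] + h["end_pos"][1]) / 2.0) for h in hauls]
    centre = (sum(m[0] for m in mids) / n, sum(m[1] for m in mids) / n)
    return {
        "n_hauls": n,                          # indicator 1
        "mean_duration_h": total_effort / n,   # indicator 2
        "total_effort_h": total_effort,        # indicator 3
        "centre_lon_lat": centre,              # indicator 4
    }
```

Applying the same function to the GFW events and to the self-sampled hauls of a matched trip gives directly comparable values for both data sources.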

Cross-validation with VMS data

For a selection of PFA fishing vessels (10 in total), VMS data (temporal resolution of 2 h) and daily logbook data were available for the period 2018–2022. From the VMS data, vessel identifier, GPS position, time-date stamp, and instantaneous speed were used to compare fishing events recorded in self-sampling data. These data were processed according to Hintzen et al. (2012), and fishing speeds were analysed and categorized according to Poos et al. (2013). Here, two categorization models were run, one identifying activity in three modes: floating/in harbour, fishing, and steaming; and one identifying activity in four modes, where the speed frequency recordings were divided into floating/in harbour, searching, fishing, and steaming, hereby deviating from the three mode model as described by Poos et al. (2013). The target species of each trip was assigned based on the dominant species reported in the logbooks. Fishing trips in the VMS data were linked to the corresponding trip in the self-sampling data based on vessel ID and date. For each VMS ping, it was determined whether the date/time stamp fell within or outside of a haul reported in the self-sampling data. This was used to calculate the share of VMS pings that were accurately assigned to fishing or non-fishing activity.
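The final step, checking each VMS ping against the self-sampled haul windows, can be illustrated as follows. This is a sketch rather than the procedure actually used: the function `classification_shares`, the activity labels, and the inclusive treatment of the haul start and end times are all assumptions.

```python
from datetime import datetime


def classification_shares(pings, hauls, fishing_labels=frozenset({"fishing"})):
    """Share of pings in each confusion-matrix cell.

    pings: iterable of (timestamp, predicted_activity) tuples.
    hauls: list of (shoot_time, haul_time) tuples from self-sampling.
    fishing_labels: predicted activities counted as gear deployment.
    """
    pings = list(pings)
    if not pings:
        raise ValueError("no pings supplied")
    counts = {"true_positive": 0, "false_positive": 0,
              "false_negative": 0, "true_negative": 0}
    for timestamp, activity in pings:
        # Ground truth: does the ping fall inside any recorded haul?
        in_haul = any(shoot <= timestamp <= haul for shoot, haul in hauls)
        predicted = activity in fishing_labels
        if predicted and in_haul:
            counts["true_positive"] += 1
        elif predicted and not in_haul:
            counts["false_positive"] += 1
        elif not predicted and in_haul:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return {cell: n / len(pings) for cell, n in counts.items()}
```

Passing a different `fishing_labels` set, for example one that also contains a searching label, shows how the shares shift when the definition of gear deployment changes.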

Results

Number of hauls identified

In total, there were 17 395 fishing events registered in the GFW data for the 35 vessels, compared with 20 573 hauls in the self-sampling (SS) datasets in the years 2018–2022. The number of hauls registered increased steadily as more vessels were added to both datasets (Table 1). In all years considered, the number of hauls estimated by GFW was more than 70% of the number in the SS dataset.

Table 1.

Number of vessels, hauls, and total fishing effort registered by year for each of the datasets (SS = self-sampling).

Year  Number of vessels   Number of hauls                 Summed haul duration (h)
      PFA     SPFA        PFA             SPFA            PFA                 SPFA
                          GFW     SS      GFW     SS      GFW       SS        GFW     SS
2018  8       3           1316    2317    47      43      8308      6285      143     45
2019  12      5           2500    3861    58      42      18 884    12 713    300     49
2020  14      17          3189    4511    409     238     22 959    14 049    1617    417
2021  15      19          3985    4276    750     463     26 270    13 792    3195    878
2022  15      17          4016    4316    948     506     26 201    12 193    4169    858

Figure 2 and Table 1 show that the number of hauls within a fishing trip is in almost all cases lower in the GFW dataset than in the PFA self-sampling dataset, while the opposite is true for the comparison between the GFW and SPFA datasets (Fig. 3). For the SPFA dataset, the ratio of GFW to SS hauls is 2:1 for blue whiting, 3:2 for herring, and 1:1 for mackerel. For the PFA dataset, the ratio is 2:3 for blue whiting and herring, ∼1:1 for horse mackerel, and ∼1:1 for mackerel.

Figure 2.

(a) Boxplot of number of hauls within trips by PFA fishing vessel and data source [GFW dataset and self-sampling (SS) data]. (b) Boxplot of the haul duration (hours) within a PFA trip by fishing vessel and data source. (c) Boxplot of hours of PFA fishing effort (the sum of number of hauls multiplied by haul duration) within a trip by fishing vessel and data source. Each panel represents a different target fishery. The first and third quartiles of vessel∼year combinations and data sources are represented by the shaded boxes, while minimum and maximum observations, not being outliers, are represented by the vertical lines. Outlying results (more than 1.5 times the interquartile range away from the first or third quartile) are represented by the dots. For visibility purposes, not all outliers are presented.

Figure 3.

(a) Boxplot of number of hauls within trips by SPFA fishing vessel and data source [GFW dataset and self-sampling (SS) data]. (b) Boxplot of the haul duration (hours) within an SPFA trip by fishing vessel and data source. (c) Boxplot of hours of SPFA fishing effort (the sum of number of hauls multiplied by haul duration) within a trip by fishing vessel and data source. Each panel represents a different target fishery. The first and third quartiles of vessel∼year combinations and data sources are represented by the shaded boxes, while minimum and maximum observations, not being outliers, are represented by the vertical lines. Outlying results (more than 1.5 times the interquartile range away from the first or third quartile) are represented by the dots. For visibility purposes, not all outliers are presented.

Average haul duration

Figures 2 (PFA) and 3 (SPFA) show that haul duration from GFW data is overestimated in most fisheries, except for the horse mackerel fishery, where averages are almost the same. An average haul takes around 4 h in the blue whiting fishery according to the self-sampling datasets, compared with 5 h and 20 min in the GFW dataset, suggesting an overestimate of around 33%. For the herring fishery, an average haul is between 1 h and 1 h 45 min according to the self-sampling data, while the GFW data suggests hauls last between 2 h 20 min and 3 h 20 min, giving an overestimate of around 100%–134%. For the horse mackerel fishery, the self-sampling and GFW recording are similar, both measuring around 2 h and 50 min. The difference is largest in the mackerel fishery, with 1 h and 10 min recorded in the self-sampling data, compared with 3 h and 25 min in the GFW data, which results in a 190% overestimate.

Fishing effort

By combining the number of hauls and the haul duration, one can calculate the total hours of fishing effort (Fig. 2 for PFA, Fig. 3 for SPFA). Overall, there are large differences between the GFW and self-sampling datasets for each of the fisheries, with the GFW dataset showing a 30% (PFA) to 230% (SPFA) overestimation in the blue whiting fishery, an overestimation in the herring fishery between 190% (PFA) and 240% (SPFA), a 60% overestimation in the horse mackerel fishery (PFA), and finally a 140% (SPFA) to 380% (PFA) overestimation in the mackerel fishery. In total, based on matching fishing trips, GFW estimates a total of 112 050 h of fishing effort, while the self-sampling data indicates 61 284 h, an 83% overestimation of fishing effort by GFW data.
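The overestimation percentages quoted here follow from a simple relative difference, taking the self-sampling value as the reference. The short sketch below reproduces the overall figure from the two totals given in the text; the function name `overestimation_pct` is hypothetical.

```python
def overestimation_pct(gfw_value, ss_value):
    """Relative overestimation of GFW versus self-sampling, in percent."""
    return (gfw_value - ss_value) / ss_value * 100.0


# Total fishing effort in hours for matched trips, as reported in the text.
print(round(overestimation_pct(112050, 61284)))  # 83

# Average blue whiting haul duration in minutes: 5 h 20 min versus 4 h.
print(round(overestimation_pct(320, 240)))  # 33
```

The same function applies to any of the per-fishery comparisons once both values are expressed in the same unit.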

Geographical position

No apparent differences in the geographical location of the midpoints of a fishing trip were found between the GFW and SS data (Fig. 4). The differences observed were small and well within the range of distribution, and medians across fisheries and datasets were nearly identical. The spread in the distribution overlapped well for all datasets with only minor differences in mostly the 1st and 3rd quartile ranges.

Figure 4.

Boxplot of longitudinal and latitudinal midpoints of fishing hauls by data source [GFW dataset and self-sampling (SS) data]. Each panel represents a different target fishery and either longitudinal or latitudinal midpoint. The first and third quartiles of each data source are represented by the shaded boxes, while minimum and maximum observations, not being outliers, are represented by the vertical lines. Outlying results (more than 1.5 times the interquartile range away from the first or third quartile) are represented by the dots.

Cross-validation with VMS data

A one-to-one comparison between self-sampling data and VMS data recording from the same trips indicated that speed alone is a poor estimator for gear deployment. Figure 5 shows the speed histogram for all VMS pings. Pings associated with fishing, i.e. pings that fell within the start and end time of a haul recorded in the self-sampling data, have a clear bandwidth in associated vessel speed, but many pings with similar speeds were not associated with fishing.

Figure 5.

Speed histogram as derived from VMS data for matching trips from VMS and self-sampling programmes. The dark blue bars represent pings that were within the time window of a haul as recorded in the self-sampling data, while the light blue bars were outside any recorded hauls.

To explore the challenges associated with using vessel speed to indicate gear deployment for this fleet, Fig. 6 shows the assignment of activities based on speed thresholds (the coloured dots) in relation to gear deployment periods as defined by GFW and the PFA self-sampling programme (vertical grey bars). Three examples are provided, corresponding to the breakdown of individual fishing trips for three separate vessels, each for a different fishery. Even when searching activity is excluded from being counted as gear deployment, the analysis based on speed thresholds results in overestimates of the duration of gear deployment, but to a lesser degree. For all three trips combined, the VMS speed-based rule classified 64.9% of pings correctly and 35.1% incorrectly (34.2% false-positively identified as gear deployment and 0.9% false-negatively identified as no gear deployment). When searching is also considered as fishing, correct classifications rise to 76.6% and incorrect classifications fall to 23.3% (17.5% false-positively identified as gear deployment and 5.8% false-negatively identified as no gear deployment). There is poor overlap between fishing event registration from GFW and SS, as shown in Fig. 6 by the misalignment of vertical bars. The mismatch is especially large in the herring fishery example (Fig. 6b) and is likely driven by the short haul duration in this fishery, which cannot be replicated by the GFW algorithm.

Figure 6.

Visualization of individual fishing trips for three separate vessels, each during a different fishery: (a) vessel during blue whiting fishery, (b) vessel during herring fishery, (c) vessel during mackerel fishery. Self-sampling recorded hauls are shown as numbered grey vertical bars in the bottom panels, while fishing events recorded by GFW are given in the top panels. VMS-estimated activity is shown as dots, where green dots = steaming; orange dots = floating/hauling of gear; light blue dots = searching behaviour; and dark blue dots = gear deployment. Where VMS-inferred activity does not overlap with GFW- or PFA-estimated activity, a cross symbol is used, while in the case of an overlap, a dot symbol is used. Dots/crosses do not represent AIS-recorded speeds.

Discussion

This paper compared registrations of gear deployment deduced by GFW using AIS data with registrations of actual gear deployment determined from the PFA and SPFA self-sampling programmes. SPFA vessels use refrigerated sea water (RSW) to temporarily store fish in tanks so they can be landed fresh to on-land processing facilities, whereas PFA vessels pack and freeze their catch on board, landing frozen products at a later date. The differences in these processes, results in necessary differences in the way that each fleet operates, with the SPFA vessels making shorter trips and fewer hauls compared to the PFA vessels. Compared to the PFA vessels, SPFA vessels often take fewer but larger hauls, while haul duration is similar across both vessel types. When the self-sampled haul registration of both the SPFA and PFA vessels was compared to products derived from analysing GFW and VMS data, fishing event estimates derived from the datasets showed markedly different estimates in terms of estimated number of hauls. GFW products suggested that SPFA vessels took more hauls than they actually did, while PFA vessels were indicated to take fewer hauls than was the case. Haul duration estimated in GFW products showed a similar bias for both vessel groups, where hauls were expected to last significantly longer than documented from the self-sampling data. Although the bias diverged for a number of hauls between SPFA and PFA, the sum of fishing effort added up to a large overestimation of fishing effort in all cases, with a minimum of 30% overestimation in the PFA blue whiting fishery, up to as much as 380% in the PFA mackerel fishery. There was agreement in the geographical position of hauls from both data sources suggesting that the matching of GFW with SS data was successful. The GFW algorithm, making use of a Convolutional Neural Network (CNN) with many different variables, has been trained on thousands of records of different fishing vessels (Kroodsma et al. 2018). 
However, the data used for training, as documented in the supplementary material of Kroodsma et al. (2018), show poor overlap with the fisheries studied here. Furthermore, GFW trained the CNN on different classes of fishing vessels, differentiating between, among others, purse seiners, longliners, and trawlers. The trawler class, however, is lumped and in Europe spans many different types of fisheries, such as demersal flatfish trawlers, demersal shrimp trawlers, demersal roundfish trawlers, and pelagic trawlers (Eigaard et al. 2016), each associated with different habitat preferences and fishing behaviour (Reijden et al. 2018, Hintzen et al. 2021). These factors could explain why the CNN algorithm is unsuccessful in estimating the correct fishing activity. Simpler models, based on speed thresholds alone but with a clear focus on pelagic trawlers, are similarly unsuccessful in accurately describing fishing behaviour. For demersal trawlers, such models are relatively successful: several authors have described methods to estimate fishing activity from VMS/AIS products and have validated the results with the help of fishers' knowledge (Rijnsdorp et al. 1998, Mills et al. 2007, Poos et al. 2013). Most commonly used methods for estimating gear deployment disregard the need for pelagic fishing vessels to search for fish, an activity for which they use acoustic systems such as forward-looking sonars and downward-looking echosounders. While searching for fish, skippers tend to take a ‘picture’ of the fish underneath the boat before shooting the gear. This activity, often undertaken at lower speeds, results in vessels turning and could be interpreted as gear deployment. This behaviour differs from that of other vessels, for example demersal trawlers, which do not locate fish using acoustic devices and have to sample the area in search of resource hotspots (Rijnsdorp et al. 2022).
When searching behaviour was accounted for in the VMS data analyses in this study, accurate identification of gear deployment increased from 64.9% to 76.6%, suggesting that excluding searching behaviour likely leads to misspecification of the ML model.
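
To make the distinction between searching and gear deployment concrete, the following is a minimal sketch in Python of a rule-based classifier of position records. It is not the method used in this study or by GFW: the speed window (`FISH_MIN_KN`, `FISH_MAX_KN`), the turning threshold (`TURN_DEG`), and all class and function names are hypothetical and chosen for illustration only. Low-speed records with a sharp change in heading are labelled as searching rather than fishing, and agreement with self-sampled labels is summarised as a simple proportion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ping:
    """One VMS/AIS position record (illustrative structure only)."""
    speed_kn: float     # speed over ground, in knots
    heading_deg: float  # course over ground, in degrees


# Hypothetical thresholds for illustration; not values from the study.
FISH_MIN_KN = 2.0   # below this the vessel is treated as not fishing
FISH_MAX_KN = 6.0   # above this the vessel is treated as steaming
TURN_DEG = 45.0     # heading change taken to indicate searching


def heading_change(a: float, b: float) -> float:
    """Smallest absolute angle between two headings, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def classify(pings: List[Ping]) -> List[str]:
    """Label each ping as 'fishing', 'searching', or 'other'."""
    labels = []
    for i, p in enumerate(pings):
        if not (FISH_MIN_KN <= p.speed_kn <= FISH_MAX_KN):
            labels.append("other")
            continue
        # Compare with the previous record; the first record has no turn.
        turn = heading_change(pings[i - 1].heading_deg, p.heading_deg) if i > 0 else 0.0
        labels.append("searching" if turn >= TURN_DEG else "fishing")
    return labels


def accuracy(predicted: List[str], observed: List[str]) -> float:
    """Proportion of records where the predicted label matches the observed one."""
    if not observed or len(predicted) != len(observed):
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)
```

With a speed window alone, every low-speed record would be labelled as fishing. Adding the turning rule is one simple way to separate searching from gear deployment, although real searching behaviour is unlikely to be captured by a single heading threshold.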

Over time, the number of hauls identified in the GFW data became more in line with those registered in the self-sampling programme. In 2018, only 57% of the total number of hauls registered in the PFA self-sampling data were captured by the GFW algorithm, while in 2021, the year with the highest resemblance and the highest number of vessels covered in the dataset, there was a 93% similarity. This could be related to continued development of the methods, as GFW specifies that the methodology is still in development. The comparison is more complex for the GFW and SPFA data, as GFW has consistently overestimated the number of hauls since 2018, with the overestimation increasing from 9% in 2018 to 87% in 2022. As GFW presents no output on the most dominant variables explaining the behaviour of pelagic trawlers, it is unclear where additional effort should be focused to improve the CNN models.
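
The percentages above can be read as signed deviations of the GFW estimate from the self-sampled reference. A minimal sketch in Python follows; the function name is hypothetical and the haul counts in the comments are invented for illustration, not taken from the study.

```python
def percent_deviation(estimated: float, reference: float) -> float:
    """Signed deviation of an estimate from a reference value, in percent.

    Negative values mean the estimate falls short of the reference
    (hauls missed); positive values mean it exceeds it (overestimation).
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    return 100.0 * (estimated - reference) / reference


# Invented counts for illustration only:
# 57 GFW hauls against 100 self-sampled hauls gives -43.0
# 187 GFW hauls against 100 self-sampled hauls gives 87.0
```

The same calculation applies to total fishing effort in hours when haul durations are summed per fishery.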

The validity of the registered number of hauls and haul duration in the self-sampling programme needs to be considered in this comparison as well. There is extensive quality control of the self-sampling data on haul registration, catch estimation, total landed catch, and production details in the freezer trawler fishery. Cross-checking haul catches against total catch estimates and landing or production details identifies trips where hauls were not registered or were added by accident. For this reason, any trip with mismatches in these numbers was excluded from further analyses. However, haul duration estimation is prone to personal interpretation. Some skippers may document the start of a haul when deploying the gear, while others may wait until the net sensors indicate that the gear is at fishing depth. As such, part of the discrepancy between GFW and self-sampling data products may be caused by a narrower interpretation of haul duration by skippers, which at times results in gear deployment estimates of only several minutes. Moreover, restrictions applied by GFW in its data pipeline prevent very short fishing events from being identified as actual gear deployment. This difference in processing would, however, likely result in an underestimate of fishing effort by GFW rather than the overestimate observed here, but it might explain the difference between PFA and SPFA haul detection, as SPFA has, on average, longer hauls. The time it takes to set and haul fishing gear depends on fishing depth, making haul duration in the blue whiting fishery, for example, substantially longer, as the gear needs to be lowered to 600–700 m, while the herring fishery can take place at much shallower depths (<150 m).
If haul duration in the self-sampling data were underestimated by ∼1 h (the average time it takes to set and haul gear), correcting for this would bring haul duration estimates in line for the herring and blue whiting fisheries and closer together in the mackerel fishery, but would suggest a larger gap for the horse mackerel fishery.

Providing data products such as GFW fishing event data is useful for science and fisheries management. The ability to construct global maps of fishing activity raises awareness of the use of marine space and of interactions with other stakeholders, and allows for the identification of illegal, unreported, and unregulated fishing (Park et al. 2023). However, when these data products are used to infer specific vessel activity, for example, by scientists, managers, or NGOs (Seguin 2024), and are linked to management implications, problems may arise from the interpretation of a biased data product. Although this study only compares fishing activity from a specific fleet segment, i.e. pelagic fishing in the Northeast Atlantic, the lumping by GFW of all trawlers, which have demonstrably different behaviour, means that several local and global studies may be at risk of misinterpreting results. Here we suggest that misinterpretation of fishing effort leads to incorrect analyses of the impact of fishing on numerous ecosystem aspects. This includes studies like Virtanen et al. (2022) and Stelzenmüller et al. (2022), who study fishing activity in areas designated for offshore wind energy. Biased fishing effort data lead to incorrect evaluation of the importance of areas at sea for fisheries and of the potential habitat available to displace effort to. Similarly, Sala et al. (2021), McDonald et al. (2024), and White et al. (2020) used GFW data to draw conclusions on the use of MPAs by fishing vessels, which in the case of the pelagic fishery would lead to concluding a presence or higher level of fishing activity than is actually the case, or even to erroneous conclusions on illegal fishing practices. Furthermore, biased estimates of fishing effort may lead to biased estimates of CPUE (Niu et al. 2024), an indicator often used in stock assessments to direct sustainable fisheries management.
If effort were consistently over- or underestimated, the use of these CPUE series would likely still be appropriate for stock assessment purposes; however, we have demonstrated here that, depending on the fleet segment, there is no consistent over- or underestimate. We showed that there is, on average, only a minor difference in the spatial allocation of fishing between the GFW data and the self-sampling data, but given the differences in estimated total effort, erroneous inferences on habitat use or habitat suitability could be drawn. Yang et al. (2024) studied the tuna purse seine fishery to estimate habitat suitability and hence could be prone to a bias in estimates of fishing effort similar to that presented in this study. Finally, the estimation of bycatch of sensitive species relies heavily on assumptions made on the spatial and temporal overlap of fishing fleets with the distribution of these sensitive species. Both Shea et al. (2023) and Bell et al. (2025) used GFW apparent fishing effort data of pelagic longliners to estimate bycatch rates of different shark species. For these studies, reliable effort and spatial estimates of actual gear deployment are highly advised.
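
The effect of an effort bias on CPUE follows directly from its definition as catch divided by effort. The sketch below in Python is a simple illustration rather than an analysis from this study. It assumes that catch is known without error and that effort is overestimated by a given percentage; the function name is hypothetical.

```python
def cpue_ratio(effort_overestimate_pct: float) -> float:
    """Estimated CPUE as a fraction of true CPUE.

    With CPUE = catch / effort and effort inflated by a factor (1 + b),
    the estimated CPUE equals the true CPUE divided by (1 + b).
    Assumes catch is known exactly (an assumption of this sketch).
    """
    if effort_overestimate_pct <= -100.0:
        raise ValueError("effort cannot be underestimated by 100% or more")
    return 1.0 / (1.0 + effort_overestimate_pct / 100.0)


# The lower and upper effort overestimates reported in this study.
for pct in (30.0, 380.0):
    print(f"effort +{pct:.0f}% -> CPUE at {cpue_ratio(pct):.2f} of its true value")
```

Under these assumptions, a 30% overestimate of effort would leave CPUE at about 77% of its true value, and a 380% overestimate at about 21%. Because the scaling differs this much between fisheries, the resulting CPUE series would not share a common bias.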

A straightforward solution to some of the concerns raised about the quality of the GFW data would be to provide scientists with readily available, free-of-charge, unprocessed AIS data so that they can develop their own analytical methodologies, or to provide processed AIS data that include uncertainty estimates on fishing activity identification. This would also prevent scientists from having to use research funds to purchase AIS data on multiple occasions. Another avenue for improvement would be closer collaboration between the fishing sector and organizations like GFW to collectively reduce bias in data products on fishing effort or to work towards datasets that combine fishing activity data from different validated sources.

Acknowledgements

We would like to thank three anonymous reviewers for their valuable contributions.

Author contributions

Niels Hintzen: conceptualization, data curation, formal analysis, methodology, writing - original draft, writing - review & editing. Katie Brigden: data curation, writing - review & editing. Henrik-Jan Kaastra: formal analysis, writing - review & editing. Steven Mackinson: data curation, formal analysis, methodology, writing - original draft, writing - review & editing. Martin Pastoors: conceptualization, data curation, writing - review & editing. Lennert van de Pol: data curation, formal analysis, methodology, writing - original draft, writing - review & editing.

Conflict of interest

Niels Hintzen and Steven Mackinson are employees of the PFA and SPFA, respectively, and coordinate the self-sampling programmes. All other authors declare that they have no conflicts of interest.

Data availability

The data underlying this article will be shared on reasonable request to the corresponding author.

References

Amoroso RO, Pitcher CR, Rijnsdorp AD et al. Bottom trawl fishing footprints on the world's continental shelves. Proc Natl Acad Sci USA. 2018;115:E10275–82.

Bastardie F, Nielsen JR, Eigaard OR et al. Competition for marine space: modelling the Baltic Sea fisheries and effort displacement under spatial restrictions. ICES J Mar Sci. 2015;72:824–40.

Bell JB, Fischer JH, Carneiro APB et al. Evaluating the effectiveness of seabird bycatch mitigation measures for pelagic longlines in the South Atlantic. Biol Conserv. 2025;302:110981.

Clavelle T, Joo R, Miller N et al. 2023. gfwr: access data from Global Fishing Watch APIs. R package version 1.1.1.

EC. 1993. Council Regulation (EC) No. 2847/93 of 12 October 1993 establishing a control system applicable to the common fisheries policy. 1:16.

Eigaard OR, Bastardie F, Breen M et al. Estimating seabed pressure from demersal trawls, seines, and dredges based on gear design and dimensions. ICES J Mar Sci. 2016;73:i27–43.

Eigaard OR, Bastardie F, Hintzen NT et al. The footprint of bottom trawling in European waters: distribution, intensity, and seabed integrity. ICES J Mar Sci. 2017;74:847–65.

Galparsoro I, Pouso S, García-Barón I et al. Predicting important fishing grounds for the small-scale fishery, based on automatic identification system records, catches, and environmental data. ICES J Mar Sci. 2024;81:453–69.

Global Fishing Watch, I. 2022. Copyright 2022.

Gulland J. Catch per unit effort as a measure of abundance. Rapp p.-v. Réun Cons Int Explor Mer. 1964;155:8–14.

Hintzen NT, Aarts G, Poos JJ et al. Quantifying habitat preference of bottom trawling gear. ICES J Mar Sci. 2021;78:172–84.

Hintzen NT, Bastardie F, Beare DJ et al. VMStools: open-source software for the processing, analysis and visualisation of fisheries logbook and VMS data. Fish Res. 2012;115–116:31–43.

ICES. OSPAR request on the production of spatial data layers of fishing intensity/pressure. In ICES Advice: Special Requests Report, edited by ICES. Copenhagen: ICES, 2021.

ICES. Bycatch of endangered, threatened and protected species of marine mammals, seabirds and marine turtles, and selected fish species of bycatch relevance. In Report of the ICES Advisory Committee, 2024. ICES Advice 2024, 2024.

IMO. 2015. Revised guidelines for the onboard operational use of shipborne automatic identification systems (AIS).

Joo R, Salcedo O, Gutierrez M et al. Defining fishing spatial strategies from VMS data: insights from the world's largest monospecific fishery. Fish Res. 2015;164:223–30.

Kroodsma DA, Mayorga J, Hochberg T et al. Tracking the global footprint of fisheries. Science. 2018;359:904–8.

Mackinson S, Brigden K, Craig J et al. The road to incorporating Scottish pelagic industry data in science for stock assessments. Front Mar Sci. 2023;10:1075345.

McDonald G, Bone J, Costello C et al. Global expansion of marine protected areas and the redistribution of fishing effort. Proc Natl Acad Sci. 2024;121.

Mills CM, Townsend SE, Jennings S et al. Estimating high resolution trawl fishing effort from satellite-based vessel monitoring system data. ICES J Mar Sci. 2007;64:248–55.

Mullié WC. Apparent reduction of illegal trawler fishing effort in Ghana’s Inshore Exclusive Zone 2012–2018 as revealed by publicly available AIS data. Mar Policy. 2019;108:103623.

Murawski SA, Wigley SE, Fogarty MJ et al. Effort distribution and catch patterns adjacent to temperate MPAs. ICES J Mar Sci. 2005;62:1150–67.

Natale F, Gibin M, Alessandrini A et al. Mapping fishing effort through AIS data. PLoS One. 2015;10:e0130746.

Niu Z, Chen Z, Yu W et al. Temporal and spatial variations in squid jigging catch efficiency in the Oyashio Extension region. Fish Oceanogr. 2024;33:e12692.

Park J, Van Osdel J, Turner J et al. Tracking elusive and shifting identities of the global fishing fleet. Sci Adv. 2023;9:eabp8200.

Parker RWR, Blanchard JL, Gardner C et al. Fuel use and greenhouse gas emissions of world fisheries. Nat Clim Change. 2018;8:333–7.

Poos JJ, Turenhout MNJ, van Oostenbrugge HAE et al. Adaptive response of beam trawl fishers to rising fuel cost. ICES J Mar Sci. 2013;70:675–84.

Reijden KJ, Hintzen NT, Govers LL et al. North Sea demersal fisheries prefer specific benthic habitats. PLoS One. 2018;13:e0208338.

Rijnsdorp AD, Aarts G, Hintzen NT et al. Fishing tactics and the effect of resource depletion and interference during the exploitation of local patches of flatfish. ICES J Mar Sci. 2022;79:2093–106.

Rijnsdorp AD, Buys AM, Storbeck F et al. Micro-scale distribution of beam trawl effort in the southern North Sea between 1993 and 1996 in relation to the trawling frequency of the sea bed and the impact on benthic organisms. ICES J Mar Sci. 1998;55:403.

Russo T, Parisi A, Prorgi M et al. When behaviour reveals activity: assigning fishing effort to métiers based on VMS data using artificial neural networks. Fish Res. 2011;111:53–64.

Sala E, Mayorga J, Bradley D et al. Protecting the global ocean for biodiversity, food and climate. Nature. 2021;592:397–402.

Sales Henriques N, Erzini K, Gonçalves JMS et al. Let's measure it: an approach of high-resolution estimates of bottom fixed net fishing effort at national level. Fish Res. 2024;278:107118.

Seguin R. 2024. Bulldozées, 72.

Shea BD, Gallagher AJ, Bomgardner LK et al. Quantifying longline bycatch mortality for pelagic sharks in western Pacific shark sanctuaries. Sci Adv. 2023;9:33.

Shepperson JL, Hintzen NT, Szostek CL et al. A comparison of VMS and AIS data: the effect of data coverage and vessel position recording frequency on estimates of fishing footprints. ICES J Mar Sci. 2018;75:988–98.

Stelzenmüller V, Letschert J, Gimpel A et al. From plate to plug: the impact of offshore renewables on European fisheries and the role of marine spatial planning. Renew Sustain Energy Rev. 2022;158:112108.

Stelzenmüller V, Rogers SI, Mills CM. Spatio-temporal patterns of fishing pressure on UK marine landscapes, and their implications for spatial planning and management. ICES J Mar Sci. 2008;65:1081–91.

Vermard Y, Rivot E, Mahévas S et al. Identifying fishing trip behaviour and estimating fishing effort from VMS data using Bayesian hidden Markov Models. Ecol Modell. 2010;221:1757–69.

Victorero L, Moffitt R, Mallet N et al. Tracking bottom-fishing activities in protected vulnerable marine ecosystem areas and below 800-m depth in European Union waters. Sci Adv. 2025;11:eadp4353.

Virtanen EA, Lappalainen J, Nurmi M et al. Balancing profitability of energy production, societal impacts and biodiversity in offshore wind farm design. Renew Sustain Energy Rev. 2022;158:112087.

Welch H, Clavelle T, White TD et al. Hot spots of unseen fishing vessels. Sci Adv. 2022;8:eabq2109.

White TD, Ong T, Ferretti F et al. Tracking the response of industrial fishing fleets to large marine protected areas in the Pacific Ocean. Conserv Biol. 2020;34:1571–8.

Yang S, Wang L, Fei Y et al. Spatio-temporal variability of fishing habitat suitability to tuna purse seine fleet in the Western and Central Pacific Ocean. Reg Stud Mar Sci. 2024;70:103366.

Zhang C, Chen Y, Xu B et al. The dynamics of the fishing fleet in China Seas: a glimpse through AIS monitoring. Sci Total Environ. 2022;819:153150.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Handling Editor: Valerio Bartolino