Nanopublication

ID

https://w3id.org/sciencelive/np/RAIqyPwP8PzMKSKCMKb04vzR1CpttIsAYqIRy6RkaBPZk

Content

@prefix this: <https://w3id.org/sciencelive/np/RAIqyPwP8PzMKSKCMKb04vzR1CpttIsAYqIRy6RkaBPZk> .
@prefix sub: <https://w3id.org/sciencelive/np/RAIqyPwP8PzMKSKCMKb04vzR1CpttIsAYqIRy6RkaBPZk/> .
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix npx: <http://purl.org/nanopub/x/> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

sub:Head {
  this: a np:Nanopublication;
    np:hasAssertion sub:assertion;
    np:hasProvenance sub:provenance;
    np:hasPublicationInfo sub:pubinfo .
}

sub:assertion {
  sub:cnn-scattering-stacking-outcome a <https://w3id.org/sciencelive/o/terms/FORRT-Replication-Outcome>;
    <http://schema.org/endDate> "2026-04-22"^^xsd:date;
    <http://www.w3.org/2000/01/rdf-schema#label> "CNN + scattering stacking lifts mean rare-class recall on plankton classification by 8.4 pp at 0.72 pp top-1 cost";
    <https://w3id.org/sciencelive/o/terms/hasConclusionDescription> "The chain confirms that complementary multi-scale scattering features can be combined with a CNN's softmax probabilities via class-weighted stacking to lift mean rare-class recall on Decrop et al. 2025's 95-class plankton classification benchmark, at an acceptable cost to aggregate top-1 accuracy. The improvement is concentrated in classes where the CNN's training set is smallest, namely the rare-species long tail acknowledged as a limitation in the original paper. The trade-off is favourable for biodiversity-monitoring applications where rare-taxon detection carries disproportionate ecological signal: 33 of 95 classes show meaningful improvement (more than 1 percentage point) under stacking, 16 are meaningfully worse, and 46 are within ±1 percentage point.";
    <https://w3id.org/sciencelive/o/terms/hasConfidenceLevel> <https://w3id.org/sciencelive/o/terms/HighConfidence>;
    <https://w3id.org/sciencelive/o/terms/hasEvidenceDescription> """Evaluation on Decrop et al. 2025's released test.txt partition (33,718 images, 95 classes) using val-trained stacking. All metrics computed via scikit-learn against the integer labels in test.txt. Mean rare-class recall is averaged over the 13 classes with fewer than 200 examples in Decrop's train.txt.

- CNN alone (Decrop 2025 baseline): top-1 86.34%, top-5 98.70%, mean rare-class recall 47.70%.
- Scattering + LR alone: top-1 26.93%, top-5 60.43%, mean rare-class recall 42.99%.
- Naive 50/50 probability ensemble: top-1 86.28%, top-5 94.82%, mean rare-class recall 50.33%.
- Stacked LR (val-trained): top-1 85.62%, top-5 95.39%, mean rare-class recall 56.08%.
- Oracle ceiling (per-image hard switch between CNN and scattering): top-1 87.68%, mean rare-class recall 64.55%.

Stacked vs CNN: mean rare-class recall +8.38 percentage points, top-1 −0.72 percentage points, top-5 −3.31 percentage points. Per-class delta histogram: 33 classes improve by more than 1 percentage point, 16 worsen by more than 1 percentage point, 46 are within ±1 percentage point. Standout single-class result: Foraminifera (CNN train-set size 88), where stacking lifts recall from 18.2% to 36.4%. A 5-fold cross-validation estimate on the test set (58.2%) agreed with the held-out val→test result (56.1%) to within about 2 percentage points, indicating that the result generalises rather than reflecting overfitting to the validation split. Full canonical results JSON archived at Zenodo (DOI 10.5281/zenodo.19687112).

GitHub repository: https://github.com/annefou/fiesta-scattering-bio""";
    <https://w3id.org/sciencelive/o/terms/hasLimitationsDescription> """Three caveats. 
(1) Single dataset. All evaluation is on Decrop's LifeWatch FlowCam BPNS dataset; the rare-class recall lift may differ on other plankton imaging datasets (IFCB, Zooscan, EcoTaxa) with different morphology distributions, image quality, or class taxonomies. 
(2) Single CNN architecture. The stacking is applied on top of EfficientNetV2-B0 specifically; whether the improvement holds with larger CNNs, vision transformers, or self-supervised foundation models is unknown. 
(3) Oracle ceiling not reached. The oracle (perfect per-image switch) achieves 64.55% mean rare-class recall versus stacking's 56.08%. This gap of about 8.5 percentage points suggests that more sophisticated meta-classifiers (e.g., gating networks conditional on image features) could close it further. However, preliminary experiments with HistGradientBoosting on the same meta-features overfit (61 of 95 classes worse than CNN), so non-linear meta-classifiers are not straightforwardly better.""";
    <https://w3id.org/sciencelive/o/terms/hasOutcomeRepository> <https://doi.org/10.5281/zenodo.19701108>;
    <https://w3id.org/sciencelive/o/terms/hasValidationStatus> <https://w3id.org/sciencelive/o/terms/Validated>;
    <https://w3id.org/sciencelive/o/terms/isOutcomeOf> <https://w3id.org/sciencelive/np/RAauQR3eY4NGILF2ei5a6YLbZjT7JDK_tQGYyl2R4DcMk/decrop-2025-cnn-scattering-stacking-study> .
}

sub:provenance {
  sub:assertion prov:wasAttributedTo <https://orcid.org/0000-0002-1784-2920> .
}

sub:pubinfo {
  <https://orcid.org/0000-0002-1784-2920> <http://xmlns.com/foaf/0.1/name> "Anne Fouilloux" .
  
  this: dc:created "2026-04-26T19:46:43.293Z"^^xsd:dateTime;
    dc:creator <https://orcid.org/0000-0002-1784-2920>;
    dc:license <https://creativecommons.org/licenses/by/4.0/>;
    npx:introduces sub:cnn-scattering-stacking-outcome;
    npx:wasCreatedAt <https://platform.sciencelive4all.org>;
    <http://www.w3.org/2000/01/rdf-schema#label> "NP created using Declaring a replication study outcome according to FORRT";
    <https://w3id.org/np/o/ntemplate/wasCreatedFromTemplate> <https://w3id.org/np/RA2zljn0Nw9SadppOyxZoh-_Rxosslrq-vYG-p9SttnJE> .
  
  sub:sig npx:hasAlgorithm "RSA";
    npx:hasPublicKey "MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAoDcOiD+jen8awiJ6DB2ewDw66PeG64hODmgNFwy7GrwQui4HKnHdvxd++1UhTgiOfycxyxBb7sXPSikLw/1TsSyPsEl0P3/+600szxpTGgLNzW+bZ2DVP3d8ERMV1aWpH0ci3B/5vmK+vXQZ4uCoq57NE0MiFg5c13Gy0gd6n7wZYEhYM4AjWSLL0QS/HY+TFZMYL9bCFeATennGrlB2UEjRlw21UB2Ah16ZZ6hxQlfctFJZE7TGnBJPB3ttTjfcOfamhjZVwQ0yV9mv7x6PGiSmkzpJTVLjn8hagoKT05YUwVQArFb+w7f6sXqvvljMigjd/Rbqgbye/lLUAZLfJSnFM58TubfpEJvXV4zNMDEoT3VQ7dokgoLgMrmjZCKATtQ7gomocoTJ1NhN2esRNtGzWaS2obL/mueUQlMlavssZnqL8WICkdAuDlwDVNbsbwEWKQ50kiPdAdduSigifxA4CM7TgvnxqZVoAResEGP6UhTTem3T4CsbEas1Caj9wa7M1jPjACu5LF5BwcVns3ZQHWLipjRjD+9/ur3G8QtuxbNhmXlDYQ6tXxB1lK+Oz7O519b3bA15ilzFl0SdvMBGTe46xaQ9DsJT18THKnPbUhNMy0dH0VtzpB+EEaXZ25Fp9VHMEUqo1lLS9e89eO3efiqkESKQ7wmB+/DlIRcCAwEAAQ==";
    npx:hasSignature "F+LTZTk90Obbjohs2UvDj0K25iOJ2oTwACg/0vovabb0X+frfdhcE8yJEmBiY4syKGbLxRq2B0vHLXjkB6Uj4ExN2eNjbcohMlqIeFcTYxHHbgeUN7iMh8UvHwajCHY8mECb+jb6gmczJSd2iNX39jMhxX+RiRz31owWLMINQ+8puuX3AMqp4GMA2r4KkHnq+iWvYQWIYp2uTtMPxgDuauxFPICsa6jHlETjHVLPRjnmQs18aa5+l8hHxEwdRfZ9/g1cC6GxH4M0UKadcFVifMPn7w6b7tSfjBGiq9xYV9aRJK1n4Kt9Fpa1VucoZIUzWJ19v1bK5DNLAdcFtB4qCOcKa6GDceP8qh50DAWj5xOa1fYYPQ/lfk6G0v4d6XNDgRKgipTzzKmmt5PPqEkfuUI4lyLenS2YZvrPgmuS61WWkqGWe9DM3WzOXBOLXycEXw3oyXdjGzE7wREon0tT3UbQIFqQeoI1AT06K71QRbGJQMJRY6rP+FIMqC3pBt3yI5hueyCivEzMvWBtL+x32Z/3XuCGn79tT2NDygB4ZujNwTA1LmsFanXPjHgq12lhvya5anvUsi3rWConKQJam7rbVkoQi7JHMFCifLfOqScQhoHTIeW3BcND+leFSp8rULLZPXVWJFlRgXr/ToM4+G+M71ERB+LTwDQGsjiUV80=";
    npx:hasSignatureTarget this:;
    npx:signedBy <https://orcid.org/0000-0002-1784-2920> .
}
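
The headline deltas in the evidence description can be re-derived from the per-model metrics it reports. The following is a minimal Python sketch of that arithmetic, not part of the signed nanopublication or of the study's code. The dictionary layout and the names `metrics`, `delta_pp`, `deltas`, `oracle_gap`, and `class_counts` are illustrative assumptions; only the numbers are taken from the text above.

```python
# Metrics (in percent) copied from the evidence description above.
# The data layout and all names here are illustrative assumptions.
metrics = {
    "cnn": {"top1": 86.34, "top5": 98.70, "rare_recall": 47.70},
    "stacked": {"top1": 85.62, "top5": 95.39, "rare_recall": 56.08},
    "oracle": {"top1": 87.68, "rare_recall": 64.55},
}


def delta_pp(model, baseline, key):
    """Difference model - baseline in percentage points, rounded to 2 places."""
    return round(metrics[model][key] - metrics[baseline][key], 2)


# Stacked LR relative to the CNN baseline.
deltas = {
    key: delta_pp("stacked", "cnn", key)
    for key in ("rare_recall", "top1", "top5")
}

# Remaining headroom between stacking and the oracle ceiling.
oracle_gap = delta_pp("oracle", "stacked", "rare_recall")

# Per-class delta histogram; the three bins should cover all 95 classes.
class_counts = {"improved": 33, "worsened": 16, "within_1pp": 46}

print(deltas)
print(oracle_gap)
print(sum(class_counts.values()))
```

Under these inputs the sketch reproduces the stated +8.38, −0.72, and −3.31 percentage-point deltas, shows that the "8.5 percentage-point" oracle gap is a rounding of 8.47, and confirms that the three histogram bins sum to 95 classes.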