Dataset viewer documentation

Filter rows in a dataset

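The dataset viewer provides a /filter endpoint for filtering rows in a dataset.

Currently, only datasets with Parquet exports are supported, so that the dataset viewer can index the contents and run the filter query without downloading the whole dataset.

This guide shows you how to use the dataset viewer's /filter endpoint to filter rows based on a query string. You can also try it out with ReDoc.

The /filter endpoint accepts the following query parameters:

  • dataset: the dataset name, for example nyu-mll/glue or mozilla-foundation/common_voice_10_0
  • config: the subset name, for example cola
  • split: the split name, for example train
  • where: the filter condition
  • orderby: the order-by clause
  • offset: the offset of the slice, for example 150
  • length: the length of the slice, for example 10 (maximum: 100)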
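As a minimal sketch of how these parameters fit together, the helper below assembles a /filter URL and percent-encodes every value. The function name build_filter_url and its default values are hypothetical (the defaults are the helper's own choice, not necessarily the API's); the base URL is the one used in the request example in this guide.

```python
from urllib.parse import urlencode

# Base URL as used in the request example in this guide.
BASE_URL = "https://datasets-server.huggingface.co/filter"


def build_filter_url(dataset, config, split, where, orderby=None, offset=0, length=100):
    """Return a /filter URL with every query parameter percent-encoded.

    The defaults for offset and length are this helper's own choice.
    """
    if length > 100:
        # The guide states a maximum slice length of 100.
        raise ValueError("length must not exceed 100")
    params = {"dataset": dataset, "config": config, "split": split, "where": where}
    if orderby is not None:
        params["orderby"] = orderby
    params["offset"] = offset
    params["length"] = length
    return BASE_URL + "?" + urlencode(params)


url = build_filter_url("ibm/duorc", "SelfRC", "train", '"no_answer"=true', offset=150, length=2)
print(url)
```

Encoding the values this way keeps the double quotes, spaces and operators of a where clause from being misread as URL syntax.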

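The where parameter must be expressed as a comparison predicate, which can be:

  • a simple predicate composed of a column name in double quotes, a comparison operator, and a value
    • the comparison operators are: =, <>, >, >=, <, <=
  • a composite predicate composed of two or more simple predicates (optionally grouped with parentheses to indicate the order of evaluation), combined with logical operators
    • the logical operators are: AND, OR, NOT

For example, the following where parameter value: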

where="age">30 AND ("name"='Simone' OR "children"=0)

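filters the data to select only the rows where the float "age" column is greater than 30 and either the string "name" column equals 'Simone' or the integer "children" column equals 0.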
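To make the evaluation order concrete, the same condition can be written as a plain Python predicate. This is only an illustration of the logic: the rows below are hypothetical, and the endpoint itself evaluates the predicate on the server side.

```python
# Hypothetical rows, used only to illustrate the predicate's logic.
rows = [
    {"age": 42.0, "name": "Simone", "children": 2},
    {"age": 35.5, "name": "Alex", "children": 0},
    {"age": 51.0, "name": "Alex", "children": 3},
    {"age": 28.0, "name": "Simone", "children": 0},
]


def matches(row):
    """Mirror of: "age">30 AND ("name"='Simone' OR "children"=0)."""
    return row["age"] > 30 and (row["name"] == "Simone" or row["children"] == 0)


selected = [row for row in rows if matches(row)]
print(selected)
```

The last row satisfies the parenthesized part but is rejected because its age is not greater than 30, which shows that the AND applies to the whole group.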

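Note that, following SQL syntax, column names in the comparison predicate should be enclosed in double quotes ("name") and string values must be enclosed in single quotes ('Simone'). Additionally, if a string value contains a single quote, it must be escaped with another single quote, for example: 'O''Hara'.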
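The quoting rules above can be sketched with two small helpers. The names quote_column and quote_string are hypothetical, and the sketch covers only the cases described here: it does not handle column names that themselves contain double quotes.

```python
def quote_column(name):
    """Wrap a column name in double quotes."""
    return '"' + name + '"'


def quote_string(value):
    """Wrap a string value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


predicate = quote_column("name") + "=" + quote_string("O'Hara")
print(predicate)  # "name"='O''Hara'
```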

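The orderby parameter must contain the name of the column (in double quotes) whose values will be sorted, in ascending order by default. To sort in descending order, use the DESC keyword, for example orderby="age" DESC.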
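A minimal sketch of building the orderby value in code, assuming a hypothetical helper named build_orderby:

```python
def build_orderby(column, descending=False):
    """Return an orderby clause: the double-quoted column, plus DESC if requested."""
    clause = '"' + column + '"'
    if descending:
        clause += " DESC"
    return clause


print(build_orderby("age"))                   # "age"
print(build_orderby("age", descending=True))  # "age" DESC
```

Because the clause contains double quotes and may contain a space, it should be percent-encoded when placed in a URL.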

For example, let's filter the rows with no_answer=true in the train split of the SelfRC subset of the ibm/duorc dataset, and restrict the results to the slice 150-151:

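import requests

# API_TOKEN is assumed to be defined beforehand and to hold your Hugging Face access token.
headers = {"Authorization": f"Bearer {API_TOKEN}"}
# The URL is wrapped in single quotes because the where clause contains double quotes.
API_URL = 'https://datasets-server.huggingface.co/filter?dataset=ibm/duorc&config=SelfRC&split=train&where="no_answer"=true&offset=150&length=2'

def query():
    # Send the GET request and decode the JSON body of the response.
    response = requests.get(API_URL, headers=headers)
    return response.json()

data = query()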

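The endpoint response is a JSON object (same format as /rows) with two main keys:

  • The features of the dataset, including each column's name and data type.
  • The slice of rows of the dataset and the content of each column in each row.

The rows are ordered by row index.

For example, here are the features and the slice 150-151 of matching rows of the ibm/duorc/SelfRC train split for the where condition no_answer=true: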

{
   "features":[
      {
         "feature_idx":0,
         "name":"plot_id",
         "type":{
            "dtype":"string",
            "_type":"Value"
         }
      },
      {
         "feature_idx":1,
         "name":"plot",
         "type":{
            "dtype":"string",
            "_type":"Value"
         }
      },
      {
         "feature_idx":2,
         "name":"title",
         "type":{
            "dtype":"string",
            "_type":"Value"
         }
      },
      {
         "feature_idx":3,
         "name":"question_id",
         "type":{
            "dtype":"string",
            "_type":"Value"
         }
      },
      {
         "feature_idx":4,
         "name":"question",
         "type":{
            "dtype":"string",
            "_type":"Value"
         }
      },
      {
         "feature_idx":5,
         "name":"answers",
         "type":{
            "feature":{
               "dtype":"string",
               "_type":"Value"
            },
            "_type":"List"
         }
      },
      {
         "feature_idx":6,
         "name":"no_answer",
         "type":{
            "dtype":"bool",
            "_type":"Value"
         }
      }
   ],
   "rows":[
      {
         "row_idx":12825,
         "row":{
            "plot_id":"/m/06qxsf",
            "plot":"Prologue\nA creepy-looking coroner introduces three different horror tales involving his current work on cadavers in \"body bags\".\n\"The Gas Station\"[edit]\nAnne is a young college student who arrives for her first job working the night shift at an all-night filling station near Haddonfield, Illinois (a reference to the setting of Carpenter's two Halloween films). The attending worker, Bill, tells her that a serial killer has broken out of a mental hospital, and cautions her not to leave the booth at the station without the keys because the door locks automatically. After Bill leaves, Anne is alone and the tension mounts as she deals with various late-night customers seeking to buy gas for a quick fill-up, purchase cigarettes or just use the restroom key, unsure whether any of them might be the escaped maniac. Eventually, when Anne suspects that the escaped killer is lurking around the gas station, she tries to call the police, only to find that the phone line is dead. Soon after that, she finds an elaborately grotesque drawing in the Restroom and then the dead body of a transient sitting in a pickup truck on the lift in one of the garage bays. She makes a phone call for help which results in her realization that \"Bill\", the attending worker she met earlier, is in fact the escaped killer, who has killed the real Bill and is killing numerous passers-by. She finds the real Bill's dead body in one of the lockers. Serial Killer \"Bill\" then reappears and attempts to kill Anne with a machete, breaking into the locked booth by smashing out the glass with a sledgehammer and then chasing her around the deserted garage. Just as he is about to kill her, a customer returns, having forgotten his credit card, and he wrestles the killer, giving Anne time to crush him under the vehicle lift.\n\"Hair\"[edit]\nRichard Coberts is a middle-aged businessman who is very self-conscious about his thinning hair. 
This obsession has caused a rift between him and his long-suffering girlfriend Megan. Richard answers a television ad about a \"miracle\" hair transplant operation, pays a visit to the office, and meets the shady Dr. Lock, who, for a very large fee, agrees to give Richard a surgical procedure to make his hair grow back. The next day, Richard wakes up and removes the bandage around his head, and is overjoyed to find that he has a full head of hair. But soon he becomes increasingly sick and fatigued, and finds his hair continuing to grow and, additionally, growing out of parts of his body, where hair does not normally grow. Trying to cut some of the hair off, he finds that it \"bleeds\", and, examining some of the hairs under a magnifying glass, sees that they are alive and resemble tiny serpents. He goes back to Dr. Lock for an explanation, but finds himself a prisoner as Dr. Lock explains that he and his entire staff are aliens from another planet, seeking out narcissistic human beings and planting seeds of \"hair\" to take over their bodies for consumption as part of their plan to spread their essence to Earth.\n\"Eye\"[edit]\nBrent Matthews is a baseball player whose life and career take a turn for the worse when he gets into a serious car accident in which his right eye is gouged out. Unwilling to admit that his career is over, he jumps at the chance to undergo an experimental surgical procedure to replace his eye with one from a recently deceased person. But soon after the surgery he begins to see things out of his new eye that others cannot see, and begins having nightmares of killing women and having sex with them. Brent seeks out the doctor who operated on him, and the doctor tells him that the donor of his new eye was a recently executed serial killer and necrophile who killed several young women, and then had sex with their dead bodies. Brent becomes convinced that the spirit of the dead killer is taking over his body so that he can resume killing women. 
He flees back to his house and tells his skeptical wife, Cathy, about what is happening. Just then the spirit of the killer emerges and attempts to kill Cathy as well. Cathy fights back, subduing him long enough for Brent to re-emerge. Realizing that it is only a matter of time before the killer emerges again, Brent cuts out his donated eye, severing his link with the killer, but then bleeds to death.\nEpilogue The coroner is finishing telling his last tale when he hears a noise from outside the morgue. He crawls back inside a body bag, revealing that he himself is a living cadaver, as two other morgue workers begin to go to work on his \"John Doe\" corpse.",
            "title":"John Carpenter presents Body Bags",
            "question_id":"cf58489f-12ba-ace6-67a7-010d957b4ff4",
            "question":"What happens soon after the surgery?",
            "answers":[
               
            ],
            "no_answer":true
         },
         "truncated_cells":[
            
         ]
      },
      {
         "row_idx":12836,
         "row":{
            "plot_id":"/m/04z_3pm",
            "plot":"In 1976, eight-year-old Mary Daisy Dinkle (Bethany Whitmore) lives a lonely life in Mount Waverley, Australia. At school, she is teased by her classmates because of an unfortunate birthmark on her forehead; while at home, her distant father, Noel, and alcoholic, kleptomaniac mother, Vera, provide little support. Her only comforts are her pet rooster, Ethel; her favourite food, sweetened condensed milk; and a Smurfs-like cartoon show called The Noblets. One day, while at the post office with her mother, Mary spots a New York City telephone book and, becoming curious about Americans, decides to write to one. She randomly chooses Max Jerry Horowitz's name from the phone book and writes him a letter telling him about herself, sending it off in the hope that he will become her pen friend.\nMax Jerry Horowitz (Philip Seymour Hoffman) is a morbidly obese 44-year-old ex-Jewish atheist who has trouble forming close bonds with other people, due to various mental and social problems. Though Mary's letter initially gives him an anxiety attack, he decides to write back to her, and the two quickly become friends (partly due to their shared love of chocolate and The Noblets). Due to Vera's disapproval of Max, Mary tells him to send his letters to her agoraphobic neighbour, Len Hislop, whose mail she collects regularly. When Mary later asks Max about love, he suffers a severe anxiety attack and is institutionalized for eight months. After his release, he is hesitant to write to Mary again for some time. On his 48th birthday, he wins the New York lottery, using his winnings to buy a lifetime supply of chocolate and an entire collection of Noblet figurines. He gives the rest of his money to his elderly neighbour Ivy, who uses most of it to pamper herself before dying in an accident with a malfunctioning jet pack. 
Meanwhile, Mary becomes despondent, thinking Max has abandoned her.\nOn the advice of his therapist, Max finally writes back to Mary and explains he has been diagnosed with Asperger syndrome. Mary is thrilled to hear from him again, and the two continue their correspondence for the next several years. When Noel retires from his job at a tea bag factory, he takes up metal detecting, but is soon swept away (and presumably killed) by a big tidal bore while on a beach. Mary (Toni Colette) goes to university and has her birthmark surgically removed, and develops a crush on her Greek Australian neighbour, Damien Popodopoulos (Eric Bana). Drunk and guilt-ridden over her husband's death, Vera accidentally kills herself after she drinks embalming fluid (which she mistook for cooking sherry). Mary and Damien grow closer following Vera's death and are later married.\nInspired by her friendship with Max, Mary studies psychology at university, writing her doctoral dissertation on Asperger syndrome with Max as her test subject. She plans to have her dissertation published as a book; but when Max receives a copy from her, he is infuriated that she has taken advantage of his condition, which he sees as an integral part of his personality and not a disability that needs to be cured. He breaks off communication with Mary (by removing the letter \"M\" from his typewriter), who, heartbroken, has the entire run of her book pulped, effectively ending her budding career. She sinks into depression and begins drinking cooking sherry, as her mother had done. While searching through a cabinet, she finds a can of condensed milk, and sends it to Max as an apology. 
She checks the post daily for a response and one day finds a note from Damien, informing her that he has left her for his own pen friend, Desmond, a sheep farmer in New Zealand.\nMeanwhile, after an incident in which he nearly chokes a homeless man (Ian \"Molly\" Meldrum) in anger, after throwing a used cigarette, Max realizes Mary is an imperfect human being, like himself, and sends her a package containing his Noblet figurine collection as a sign of forgiveness. Mary, however, has sunken into despair after Damien's departure, and fails to find the package on her doorstep for several days. Finding some Valium that had belonged to her mother, and unaware that she is pregnant with Damien's child, Mary decides to commit suicide. As she takes the Valium and is on the verge of hanging herself, Len knocks on her door, having conquered his agoraphobia to alert her of Max's package. Inside, she finds the Noblet figurines and a letter from Max, in which he tells her of his realization that they are not perfect and expresses his forgiveness. He also states how much their friendship means to him, and that he hopes their paths will cross one day.\nOne year later, Mary travels to New York with her infant child to finally visit Max. Entering his apartment, Mary discovers Max on his couch, gazing upward with a smile on his face, having died earlier that morning. Looking around the apartment, Mary is awestruck to find all the letters she had sent to Max over the years, laminated and taped to the ceiling. Realizing Max had been gazing at the letters when he died, and seeing how much he had valued their friendship, Mary cries tears of joy and joins him on the couch.",
            "title":"Mary and Max",
            "question_id":"1dc019ad-80cf-1d49-5a69-368f90fae2f8",
            "question":"Why was Mary Daisy Dinkle teased in school?",
            "answers":[
               
            ],
            "no_answer":true
         },
         "truncated_cells":[
            
         ]
      }
   ],
   "num_rows_total":627,
   "num_rows_per_page":100,
   "partial":false
}
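As a sketch of how such a response might be consumed, the snippet below reads column types and row values from a trimmed copy of the response above. Only two of the features are kept and the long "plot" values are omitted; the variable names are hypothetical.

```python
# Trimmed copy of the response above: most features and the "plot" text are omitted.
response = {
    "features": [
        {"feature_idx": 2, "name": "title", "type": {"dtype": "string", "_type": "Value"}},
        {"feature_idx": 6, "name": "no_answer", "type": {"dtype": "bool", "_type": "Value"}},
    ],
    "rows": [
        {
            "row_idx": 12825,
            "row": {"title": "John Carpenter presents Body Bags", "no_answer": True},
            "truncated_cells": [],
        },
        {
            "row_idx": 12836,
            "row": {"title": "Mary and Max", "no_answer": True},
            "truncated_cells": [],
        },
    ],
    "num_rows_total": 627,
    "num_rows_per_page": 100,
    "partial": False,
}

# Map each column name to its declared dtype (None for nested types without one).
column_types = {f["name"]: f["type"].get("dtype") for f in response["features"]}

# Map each original row index to the value of one column.
titles_by_index = {item["row_idx"]: item["row"]["title"] for item in response["rows"]}

print(column_types)
print(titles_by_index)
print(response["num_rows_total"])
```

Note that row_idx is the row's index in the original split, not its position among the matching rows.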

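If the result has partial: true, it means that the filtering could not be run on the full dataset because it is too big.

Indeed, the index used by /filter can be partial if the dataset is bigger than 5GB. In that case, it only covers the first 5GB.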
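Since a single request returns at most 100 rows, retrieving every match means issuing several requests with increasing offsets, and the partial flag tells whether those matches cover the whole dataset. The sketch below computes the (offset, length) pairs for a given num_rows_total and checks the flag. The helper names are hypothetical and no request is sent.

```python
MAX_LENGTH = 100  # maximum slice length stated for the length parameter


def page_slices(num_rows_total, page_length=MAX_LENGTH):
    """Yield (offset, length) pairs that together cover num_rows_total rows."""
    if not 1 <= page_length <= MAX_LENGTH:
        raise ValueError("page_length must be between 1 and 100")
    for offset in range(0, num_rows_total, page_length):
        yield offset, min(page_length, num_rows_total - offset)


def covers_full_dataset(response):
    """Return True when the response reports partial: false."""
    return not response["partial"]


# 627 is the num_rows_total reported in the example response.
slices = list(page_slices(627))
print(len(slices), slices[0], slices[-1])  # 7 (0, 100) (600, 27)
print(covers_full_dataset({"partial": False}))  # True
```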
