{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiar2iowbfb7eoaa3e63vqtpnphtnby3ova2dai3tkzrvat7hqgioe",
    "uri": "at://did:plc:wwyqal4cnqhuwyacdj7rqq3n/app.bsky.feed.post/3mla5q7vuxom2"
  },
  "path": "/t/feature-selection-in-causal-discovery/28735#post_1",
  "publishedAt": "2026-05-06T15:53:56.000Z",
  "site": "https://discourse.datamethods.org",
  "textContent": "Hi! I am using a DAG approach to better specify a research question examining stone-free rate after kidney stone treatment. I have constructed an a priori model from a review of the relevant literature, but I am wondering whether it would be valuable to run some data-driven algorithms on the variables in the dataset to check whether this a priori model aligns with the data. I know there are both constraint-based and non-constraint-based methods, but I am wondering whether there is a more systematic framework I should be using for feature definition and selection. One big limitation is that many of the actual mediators/mechanisms in the DAG aren’t measured in the dataset, although many of the parents of those mechanisms are.",
  "title": "Feature selection in causal discovery"
}