{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiar2iowbfb7eoaa3e63vqtpnphtnby3ova2dai3tkzrvat7hqgioe",
"uri": "at://did:plc:wwyqal4cnqhuwyacdj7rqq3n/app.bsky.feed.post/3mla5q7vuxom2"
},
"path": "/t/feature-selection-in-causal-discovery/28735#post_1",
"publishedAt": "2026-05-06T15:53:56.000Z",
"site": "https://discourse.datamethods.org",
"textContent": "Hi! I am using a DAG approach in trying to better specify a research question examining stone-free rate after kidney stone treatment. I have constructed an a priori model using review of relevant literature, but wondering if it would be valuable to run some data-driven algorithms on variables in the dataset to understand if this a priori model aligns with the dataset. I know there are both constraint and non-constraint based methods, but wondering if there is a more systematic framework I should be using for feature definition and selection. One big limitation is that many of the actual mediators/mechanisms in the DAG aren’t measured in the dataset, although many of the parents to those mechanisms are.",
"title": "Feature selection in causal discovery"
}