Submitted by AutoModerator t3_zp1q0s in MachineLearning

Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead!

Thread will stay alive until next one so keep posting after the date in the title.

Thanks to everyone for answering questions in the previous thread!

15

Comments


throwaway2676 t1_j0x89o6 wrote

Are there any ML subs or forums with a biology/medicine focus? It seems to be a rapidly expanding field, but it gets almost none of the attention relative to the flashier stuff (apart from AlphaFold for a bit).

7

Straight_Bid_5577 t1_j1mxvt5 wrote

I second this. My university has a data science major with a biology emphasis that has awesome courses. I have not gone through it, since I want to focus on the ML/cybersec side of data science, but it does look intriguing and is definitely growing.

5

nuthinbutneuralnet t1_j0rcwd1 wrote

If I have a large set of input features (1000s+) and most of them can be categorized into one of several feature groups (metadata, feature extractions A, feature extractions B, etc), is it always necessary for your neural network model architecture to reflect your feature groups? For example, is it better to have one flat fully connected layer of all of the features to allow for any type of cross-interactions as opposed to, let's say creating linear or embedding layers for each feature group before combining them together. What are the pros and cons of each? What is usually done in practice?

5

alkibijad t1_j14x08t wrote

This may not be the direct answer, but it's applicable to many problems:

  1. Use the simplest approach first. This would be creating a simple model, in this case a flat fully connected layer.
  2. Measure the results.
  3. If the results aren't good enough, think about what could improve the results: different model architecture, training procedure, obtaining more data...
  4. Iterate (go to 2)

Also:

`creating linear or embedding layers for each feature group before combining them together` - this adds additional knowledge into the network, so it may help... but in theory the network should be able to find this out on its own - the combinations that don't make much sense will have weights close to zero - that's why I advise you to start without it.

1K+ features: in some cases this is a lot of features, in some it's not that big a number... but it may make sense to reduce the number of features by using a dimensionality reduction technique.

5

throw1ff t1_j0tup0w wrote

Career question here! I'd like to join ML research/engineering but am somewhat struggling with my academic background (PhD in physics). I do have 8 yrs of solid Python data stack experience and a shitload of GitHub projects, did an ML project in my science domain, and have a tailored resume for that. But I find it really hard to compete for most ML roles, which are either CV or NLP and require experience in those specific areas. For software engineering in ML, I am missing infrastructure skills such as data lakes, SQL, etc. I am in Europe and would love to join remotely. Any inspiration/advice/insight from redditors?

3

blacksnowboader t1_j19zld8 wrote

Maybe get a databricks free account and learn those skills about SQL and datalakes. Also, SQL is not that hard to learn. You can learn it in an afternoon.

2

vsmolyakov t1_j1p0eeg wrote

I recommend building up a portfolio of CV and NLP projects on GitHub; it will help sharpen your deep learning skills. Getting experience with model deployment and cloud certifications (AWS, Azure, GCP) is also valuable for ML engineering positions.

2

trnka t1_j1hbzj4 wrote

Not all positions will be open to non-traditional backgrounds, but many are. It looks to me like bigger or older companies can often look for a "traditional" resume for ML and smaller/younger companies tend to be more open-minded and focus on testing actual skills.

I've worked with a surprising number of physics PhDs in the ML space, at least in the US, so I don't see your background as a problem.

I know one of the physics PhDs did a coding bootcamp to transition into industry and that helped a lot.

1

AdFew4357 t1_j0wyyzf wrote

Have some questions regarding multivariate time series classification.

I have some experimental data where there are three measurements recorded at each time step. Each is a univariate time series (Xposition, Yposition, roadoffset, brake).

And I need to predict the response, which takes 3 values.

It’s a classification problem, and I have read the literature on the various methods. I tried distance-based methods and decided against them because the distance matrix is too costly. I have some code using the sktime library with the convolution-kernel methods (ROCKET, ARSENAL) and some ensemble methods (TimeSeriesForestClassifier).

However, I thought of a different method myself, where I split the data, into 2 second intervals, leading to 10 chunks of 2 second intervals of data. Each chunk is roughly 3000 samples.

Now that I have these 10 chunks of data, I wanted to extract some features from each of these 2 second intervals for my multivariate time series, and use these features in a classification model.

My hope is to do this so I can capture some temporal specific features. However, I don’t know what would be useful features to extract. What would be some useful features to extract from such a series?

I was hoping to get feature vectors from each of these intervals, together with my response, in order to construct another classifier. Or are there methods out there that already do this?

2

kraegarthegreat t1_j1cnl5h wrote

Kats by Meta is a good tool for investigating feature extraction. I haven't done time series classification, but from my brief work with Kats it seemed promising.

(Look for ideas there, use better tools for implementation)

1

Maria_Adel t1_j0s5ev3 wrote

What models would you recommend for forecasting demand for products (unit movements), and what key variables would you include?

1

abionic t1_j0ucem5 wrote

I've used PyTorch in a few very simple example-based projects. I have been thinking of trying out PyTorch Mobile and building a sample Flutter app with it.

I've a 2 part query:

  • Does PyTorch Mobile give lower performance or lower-quality results than, say, plain PyTorch running on a CPU?
  • Are there any good reference materials/projects I can follow for PyTorch Mobile usage in Flutter?
1

AstroBullivant t1_j0voz77 wrote

Is quantization ultimately a kind of scaling?

1

Awekonti t1_j0y7esq wrote

>Is quantization ultimately a kind of scaling

Not really. It is about approximating (or, better said, mapping) real-valued numbers onto a limited set of values, and that is where the limits come from. The model shrinks because computations and other model operations are executed at lower bit-widths.
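To make the "mapping" idea concrete, here is a minimal sketch of uniform (affine) quantization in plain Python. It is only an illustration under assumptions: the weights are made-up numbers, `quantize` and `dequantize` are hypothetical helper names, and real toolkits use more elaborate schemes. Scaling is one step of it, but the rounding to a small set of integer codes is what actually loses information.

```python
def quantize(values, num_bits=8):
    """Map floats onto integer codes in [0, 2**num_bits - 1]."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Step size between two neighbouring codes (guard against a constant input).
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


weights = [-1.0, -0.25, 0.0, 0.3, 1.0]  # made-up example values
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# Rounding to the nearest code means the error is at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is now stored as one small integer plus a shared `scale` and offset, at the cost of an error of at most half a quantization step per value.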

2

trnka t1_j1hcd0f wrote

Adding a practical example:

I worked on SDKs for mobile phone keyboards on Android devices. The phone manufacturers at the time didn't let us download language data so it needed to ship on the phones out of the box. One of the big parts of each language's data was the ngram model. Quantization allowed us to save the language model probabilities with less precision and we were able to shrink them down with minimal impact on the quality of the language model. That extra space allowed us to ship more languages and/or ship models with higher quality in the same space.

1

[deleted] t1_j0vysef wrote

[deleted]

1

Awekonti t1_j0y85fy wrote

They deploy the models with well-established pipelines. Usually (from my experience), scientists communicate closely with engineers (ML, Data Engineers or DevOps). They "put" the model into the environment where it can operate. It can be either a web service where u simply wrap the model, or container deployments (see Docker). They don't retrain the models every time the app launches, as they simply save the model's parameters in that environment or CI/CD platform. I have a little experience in deploying and maintaining production models, but I have no clue about the details tho.

3

hpxvzhjfgb t1_j0wqfx8 wrote

What is currently the highest quality speech synthesis tool freely available? The best and only realistic one I've found so far is Google's cloud TTS here. The actual cloud service isn't free, but the demo on that page seems to be usable as much as I want for free, so that works for me, although it's a bit silly having to use it like this.

1

ybhi t1_j13jkjj wrote

I want to generate a tiny sample with the most natural voice that can be generated today (with A.I. or anything else). I haven't followed the field for a while. What can be done today?

1

MedicUK_ t1_j0xv5np wrote

Hey everyone, I’m a medical doctor practising in the UK. I am about to undertake a project using machine learning to predict mortality in patients with colorectal cancer. I was going to use a supervised approach with data from a series of time points, i.e. value X over the 5 days after the operation. Is it possible with machine learning to use a trend to predict an outcome, as opposed to a static value at one point in time? If so, what statistical approach would be best?

1

trnka t1_j1hdb1o wrote

For prediction of mortality I'd suggest looking into survival analysis. The challenge with mortality is that you don't know when everyone will die; you only know about the deaths that have happened so far. This is called data censoring. To work with such data, the problem is reframed as "predict whether patient P will be alive D days after their operation".

A quick Google suggests that 90-day mortality is a common metric so I'd suggest starting there. For each patient you'd want to record mortality at 90-days as alive/dead/unknown. From there you could use traditional machine learning methods.

If the time points are standardized across patients you could use them like regular features, for instance feature1_at_day1, feature1_at_day2, ... If they aren't standardized across patients you need to get them into the same representation first. I'd suggest starting simple, maybe something like feature1_week1_avg, feature1_week2_avg, and so on. If you want to get fancier about using the trend of the measurement as input, you could fit a curve to each feature for each patient over time and use the parameters of the curve as inputs. Say if you fit a linear equation, y = mx + b, where x = time since operation and y = the measurement you care about. In that case you would fit m & b and then use those as inputs to your model. (All that said, definitely start simple)
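The curve-fitting idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a recommended pipeline: the patient IDs and measurements are invented, `fit_line` is a hypothetical helper, and the feature names just extend the naming pattern used above. It fits y = mx + b per patient by ordinary least squares and stores m and b as model inputs.

```python
def fit_line(times, values):
    """Least-squares fit of values = m * times + b; returns (m, b).

    Needs at least two distinct time points.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    m = sxy / sxx
    b = mean_v - m * mean_t
    return m, b


# Invented data: (days since operation, measured value) per patient.
# The time points do not have to line up across patients.
patients = {
    "patient_a": ([1, 2, 3, 4, 5], [10.0, 12.0, 14.0, 16.0, 18.0]),
    "patient_b": ([1, 3, 5], [9.0, 8.0, 7.0]),
}

features = {}
for patient_id, (days, values) in patients.items():
    m, b = fit_line(days, values)
    features[patient_id] = {"feature1_slope": m, "feature1_intercept": b}
```

Every patient ends up with the same two columns regardless of how many measurements they had, which is what lets irregular time series feed a standard classifier.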

The biggest challenge I'd expect is that you probably don't have a lot of mortality so machine learning is likely to overfit. For dealing with that I'd suggest starting very, very simple like regularized logistic regression to predict 90-day mortality. Keep in mind that adding features may not help you if you don't have much mortality to learn from.

Hope this helps! I've worked in medical machine learning for years and done some survival analysis but not much. We were in primary care so there was very little mortality to deal with.

1

MaedaToshiie t1_j0y9gmc wrote

I have some background in metaheuristics based optimization but not so much in machine learning. I get the feeling that metaheuristic methods are not commonly employed in machine learning. Is this primarily due to the objective function evaluation cost?

1

dangernoodle01 t1_j0yghp4 wrote

I might have the wrong question, but using GPT2, how would you begin feeding it "facts" about itself as the program, about its surroundings, etc.? I suppose repeating the same thing during training isn't exactly the solution to this. Thank you!

1

sanman t1_j0ynjfi wrote

How to Handle Lots of Missing/Null Values in Data?

There's a data set that I've been given to analyze, and it's got a lot of missing data. Typically, I should replace missing values with mean, or mode, etc. But one particular column has nearly 70% null values. What is the threshold to reject a column as unsuitable for analysis, instead of trying to replace those missing values? How large a proportion of missing values is acceptable before I have to reject/discard the column altogether? Is there some rule of thumb for this?
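There is no universal cut-off, so any threshold is a judgment call that depends on the data and on why the values are missing. A first step is simply to measure the missing fraction per column and apply whatever limit you choose. The sketch below does that in plain Python; the table and the 0.5 limit (`MAX_MISSING`) are made up for illustration and are not a recommended standard.

```python
def missing_fraction(column):
    """Fraction of entries in a column that are None."""
    return sum(value is None for value in column) / len(column)


# Made-up table: column name -> list of values, None marks a missing entry.
table = {
    "age": [34, None, 51, 29, 40, 38, None, 45, 33, 60],
    "income": [None, None, None, 52000, None, None, 61000, None, None, 48000],
}

MAX_MISSING = 0.5  # hypothetical limit chosen for this example only

report = {name: missing_fraction(column) for name, column in table.items()}
kept = [name for name, fraction in report.items() if fraction <= MAX_MISSING]
dropped = [name for name in table if name not in kept]
```

Before dropping a column like `income`, it can be worth checking whether "is missing" is itself informative, for example by keeping a 0/1 indicator column.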

1

loly0ss t1_j0zjhsb wrote

Hello everyone!

I had a quick question regarding the KL divergence loss: while researching, I have seen numerous different implementations. The two most common are these two. However, looking at the mathematical equation, I'm not sure if the mean should be included.

KL_loss = -0.5 * torch.sum(1 + torch.log(sigma**2) - mean**2 - sigma**2)

OR

KL_loss = -0.5 * torch.sum(1 + torch.log(sigma**2) - mean**2 - sigma**2)

KL_loss = torch.mean(KL_loss)
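One observation on the two snippets as posted: `torch.sum` without a `dim` argument already reduces to a single scalar, so taking `torch.mean` of that scalar returns the same number. The variant usually meant sums over the latent dimensions of each sample and then averages over the batch. The plain-Python sketch below (made-up values, hypothetical helper name `kl_per_sample`) shows that the "sum over everything" and "mean over the batch" versions then differ only by a factor of the batch size, which mostly changes how the KL term is weighted against the reconstruction loss.

```python
import math


def kl_per_sample(mean, sigma):
    """KL(N(mean, sigma^2) || N(0, 1)), summed over the latent dims of one sample."""
    return -0.5 * sum(
        1 + math.log(s ** 2) - m ** 2 - s ** 2 for m, s in zip(mean, sigma)
    )


# Made-up batch: 2 samples, 3 latent dimensions each.
means = [[0.0, 0.5, -1.0], [0.2, 0.0, 0.3]]
sigmas = [[1.0, 0.8, 1.2], [0.9, 1.0, 1.1]]

per_sample = [kl_per_sample(m, s) for m, s in zip(means, sigmas)]

kl_sum = sum(per_sample)                   # sum over batch and latent dims
kl_batch_mean = kl_sum / len(per_sample)   # sum over latent dims, mean over batch
```

Whichever reduction is used, the reconstruction term should be reduced the same way so the two parts of the loss stay on a comparable scale.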

Thank you!

1

pyepyepie t1_j14a34r wrote

Why do many papers put the emphasis on performance comparisons and ignore the model's behavior?

Background - My first ML project was done around 2016-2017. It's funny to say, but SOTA for NLP was nowhere near what it is today, so even though I am relatively new to the field, I observed how transformers completely changed the world, not only the world of NLP.

Now, I am nowhere close to a research scientist, my experience is implementing stuff, but I did read relatively many NLP papers (during work and a little for grad school) - and I see that there are many papers that are improvements upon a specific task, using "cheap tricks" or just fine-tuning a new model (BERT version 100X), to get better quantitative performance.

That being said, I have yet to see a situation where getting 96% vs 95% accuracy (hopefully more info but not always) on datasets that are often imbalanced is a meaningful signal that is even ethical to report as improvement without statistical significance tests and qualitative analysis.

Again, if I look at myself as someone who builds a product, I can't see when I would ever want to use "the best" model if I don't know how it fails - which would mean I would take a 93% model instead of 95% accuracy if I can understand it better (even because the paper was more explicit and the model is a complete black-box).

My question to the smarter & more experienced people here (probably a large portion of the subreddit), is what is the counter to my argument? Do you see qualitative improvements of models (i.e., classification with less bias, better grounding of language models) as more or less important in comparison to quantitative? And if I ask you honestly, do you ever read papers that just improved SOTA without introducing significant novel ideas? If so, why do you do it (I can see a few reasons but would like to hear more)?

1

trnka t1_j1heqwa wrote

In actual product work, it's rarely sufficient to look at a single metric. If I'm doing classification, I typically check accuracy, balanced accuracy, and the confusion matrix for the quality of the model among other things. Other factors like interpretability/explainability, RAM, and latency also play into whether I can actually use a different model, and those will depend on the use case as well.
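As a small illustration of why a single metric can mislead on imbalanced data, here is a plain-Python sketch that computes accuracy, balanced accuracy, and the confusion matrix for a made-up case: 95 negatives, 5 positives, and a model that always predicts the majority class. The helper names are hypothetical; libraries such as scikit-learn offer equivalents.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Nested dict: matrix[true_label][predicted_label] -> count."""
    matrix = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred, labels):
    """Mean of per-class recall; assumes every label occurs in y_true."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    recalls = [matrix[label][label] / sum(matrix[label].values()) for label in labels]
    return sum(recalls) / len(recalls)


# Made-up imbalanced data and a model that always predicts class 0.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
labels = [0, 1]

acc = accuracy(y_true, y_pred)                        # 0.95
bal_acc = balanced_accuracy(y_true, y_pred, labels)   # 0.5
matrix = confusion_matrix(y_true, y_pred, labels)
```

The 95% accuracy looks fine, while the balanced accuracy of 0.5 and the confusion matrix show that no positive case is ever found.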

I would never feel personally comfortable with deploying a model if I haven't reviewed a sample of typical errors. But there are many people who deploy models without that and just rely on metrics. In that case it's more important to get your top-level metric right, or to get product-level metrics right and inspect any trends in, say, user churn.

> Do you see qualitative improvements of models as more or less important in comparison to quantitative?

I generally view quantitative metrics as more important though I think I value qualitative feedback much more than others in the industry. For the example of bias, I'd say that if it's valued by your employer there should be a metric for it. Not that I like having metrics for everything, but having a metric will force you to be specific about what it means.

I'll also acknowledge that there are many qualitative perspectives on quality that don't have metrics *yet*.

> do you ever read papers that just improved SOTA without introducing significant novel ideas?

In my opinion, yes. If your question was why I read them, it's because I don't know whether they contribute useful, new ideas until after I've read the paper.

Hope this helps - I'm not certain that I understood all of the question but let me know if I missed anything

2

vanilla-acc t1_j19rq8a wrote

Extremely simple question: but how do I make charts comparing 2 runs in wandb?

E.g, I want to make a report that has 5 charts. I want all 5 charts to only show 2 of my runs (out of a total of 10 runs). This should be easy, but I can't figure out how to do it.

1

Initial_Patient4994 t1_j1a47al wrote

For the MARS algorithm, how would you define a knot, really struggling to understand exactly what it is?

1

sanman t1_j1b5eb0 wrote

Is it possible to have Machine Learning for CAD designs? Could it be possible to train a model on a repository of CAD files?

Is highly structured vectorized data more efficient to train on compared to rasterized image repositories? How much more efficient?

1

cthorrez t1_j1dmafb wrote

Quick poll to see what the opinions are here:

Is it ok to tune the random seed and pick the seed which is best on the validation set?

1

trnka t1_j1hewwg wrote

It's ok but not great. Like say your model doesn't always converge, that would be one way to deal with it.

I'd prefer to see someone tune hyperparameters so that the metrics are minimally sensitive to the random seed though

1

ShortLuke t1_j1fsp5t wrote

Real basic question. I have about 4 years of Python experience, a little Java, but never tapped into machine learning. Any recommendations for online classes to learn more about it? Not looking for a career, just the basics.

1

skeletons_of_closet t1_j1h4g7z wrote

Should image augmentations (brightness, flip, etc.) be performed before or after image resizing? I want to know everybody's thoughts on that. I asked in other forums and this is the answer I got:

"It is generally recommended to perform data augmentation before resizing the image. This is because data augmentation is used to create new variations of the existing data, and resizing the image could potentially distort or alter the original image in ways that might not be desirable or meaningful. By performing data augmentation on the original, full-size image, you can be sure that the augmented data is representative of the original data and preserves the integrity of the original image."

But if we are working with large images, for example 1024*1024, isn't it better to resize to a smaller 224*224 first and then do the augmentations, since that saves time by performing fewer computations?

1

Unique-Ice3211 t1_j1iefkc wrote

Hello! I am looking for an implementation (and/or papers) of a diffusion network that could generate an image with more objects, starting from another image and some description, e.g. image of a room, text "two people talking" -> generate an image of the same room, adding two people talking. Do you know if something like that exists? Is there a pre-trained model that I could play with?

1

Lintaar t1_j1k0ydm wrote

How many nodes should a random forest classification have? I'm only using it to determine feature importance nothing more.

Every documentation I can find suggest 3,5,7 or "whatever is proper for your data". Is it based on how many features I have? How many trees I have? How many samples I have? Some mixture?

Overall, 3/5/7 give similar results, but I don't know how to tell if it's over- or underfitted.

1

GimmeFood_Please t1_j1lyqqu wrote

Have neural network architecture diagrams fallen out of favour?

I have been reading papers in graph representation learning (most relatively new) and I don't see them anymore. I guess I'm looking for things like whether there are dense layers on top of the GCNs, if so how many etc. Is examining the code where you go for such choices?

1

Straight_Bid_5577 t1_j1myag8 wrote

I am a sophomore in university, studying data science, and want to pursue cybersecurity, and eventually the ever-growing ML side of cyber. Fortunately, I worked helpdesk at my university my freshman year, and then used that experience to land a job as a Student Security Analyst at the uni’s information security office. Although I do experience imposter syndrome because I am the youngest one there, I love it a lot and am learning a bunch. Anyways, any advice or tips for me as I go through the next few years of uni that will help set me up for after I graduate? Thanks!

1

MohamedOsama36 t1_j1odi5h wrote

Hello everyone, I am a second-year computer science student. I am currently studying C, discrete math, and Java, and next year I get to choose whether to specialize in AI or in general software. Some of the subjects in the AI section are machine learning, advanced machine learning, microprocessors, embedded systems, robotics, bioinformatics, cryptography, HCI, image processing, and many more. What do I actually need to study in the meantime to prepare for all these subjects? I know I will need linear algebra, calculus, and discrete math, so I should probably study Math 1 and 2 again, but can someone tell me how much math I actually need? And should I learn Python alongside C, or should I focus on another programming language?
Another question: what do AI engineers do exactly? I really love the idea of AI and want to work on making an AI myself, but I cannot find any articles or videos of actual AI engineers explaining the whole thing.

1

sanman t1_j1qpdd0 wrote

What exactly is Validation data?
I know what Training data is for, and I know what Testing data is for.

But what is Validation data, and what is Validation for?

1

vsmolyakov t1_j1tc0dg wrote

A validation dataset is typically used to tune model hyper-parameters and for early stopping, i.e. to tell when the model starts overfitting.
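As a rough sketch of the early-stopping use, the snippet below scans a list of per-epoch validation losses and stops once the loss has not improved for `patience` epochs in a row. The loss values are invented and `early_stopping_epoch` is a hypothetical helper; deep learning frameworks ship their own callbacks for this.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best epoch seen before stopping.

    Stops once the validation loss has failed to improve for
    `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch


# Invented curves: training loss keeps falling, validation loss turns around.
train_losses = [0.90, 0.70, 0.55, 0.45, 0.38, 0.33, 0.29]
val_losses = [0.95, 0.78, 0.66, 0.62, 0.64, 0.67, 0.71]

best_epoch = early_stopping_epoch(val_losses)  # epoch index 3 (0-based)
```

The test set plays no part in this decision. It stays untouched so that the final score is not biased by the choices made using the validation data.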

1

kyolichtz t1_j1z15vg wrote

Not sure if this is the right place to ask,

I've been a ML Engineer for 2+ years now in the same company (straight out of college)

I'm considering switching companies soon and was looking for potential project ideas to put on my resume.

Is there any place I can get ideas from? (Which aren't too generic)

My resume has just one project at present. I'm looking to have 2-3 total projects in my resume.

TIA

1

vsmolyakov t1_j22s8cc wrote

Consider Kaggle for a source of data science projects that look interesting to you. If you are looking for more ML engineering / cloud native type projects, I recommend “Practical MLOps” book by Noah Gift, you can also find his GitHub for ideas.

1

kyolichtz t1_j22sgpv wrote

Thank you, I was looking for more ML Engineering oriented, Noah Gift's GitHub looks good. Will have a read on his book as well.

1

IsAskingForAFriend t1_j1z5c15 wrote

Alright, I'm looking into getting into A.I. stuff. I apologize if machine learning isn't the same thing as A.I.

I'm your bog-standard IT guy. But ChatGPT really opened up my eyes to possibilities. What would it take to be able to learn to train a model and deploy in-house solutions? For my company, I'd like to take our knowledge base and SOP and turn it into an interactive guide you can ask questions.

Or other solutions for other places. I'm just excited and really can't stop thinking of possibilities. I feel like I need some grounding before thinking A.I. is the magic do-anything stick you can whack a problem with.

1

trnka t1_j23ajbf wrote

I'd recommend starting with the [Andrew Ng Coursera specialization](https://www.coursera.org/specializations/machine-learning-introduction#courses). It's free and will give you a good base to build upon. I feel like he explains concepts very well and is good about explaining terminology.

> What would it take to be able to learn to train a model and deploy in-house solutions? For my company, I'd like to take our knowledge base and SOP and turn it into an interactive guide you can ask questions.

If the SOP is fairly short you can add it to your ChatGPT prompt and it can do Q&A from that. I found [Learn Prompting](https://learnprompting.org/docs/intro) helpful to understand how to do this.

I'm not sure about the knowledge base but it might be possible to inject that as knowledge too. The challenge is that there's a max input length.

But let's take a step back for a moment -- in general it's not too hard to learn ML basics and be able to build some model. Like it might take a few weekends depending on your schedule and previous experience with programming and math. If you want to solve a question answering problem, how much you need to learn will depend a great deal on how well you need it to work. For instance, you could probably get by with a simple search system for many things but it might not meet your bar for quality.

> I apologize if machine learning isn't the same thing as A.I.

I think of AI as the broader term and generally I think about the [AIMA table of contents](http://aima.cs.berkeley.edu/contents.html) for the general scope of AI -- machine learning is in there but there's a lot of other stuff too like logic, planning, ontologies, and optimization problems. That said, in the news AI is often used to mean "any technology that seems magical" and that's problematic because things like chess bots seemed magical in the past but no longer seem magical. So the scope of the term has shifted over time.

1

IsAskingForAFriend t1_j240d88 wrote

Thank you so much. I'll do that coursera as soon as possible so that I can begin to understand what's possible within my grasp. I'll get a better idea of what can fall in line after that. Thanks so much, I wasn't expecting an answer to such a vague question

2

requizm t1_j1zazxf wrote

Hey everyone. I have a simple tensorflow model that predicts if image is cat or not cat. I'm using a large image as an input. I wanna detect all cats in this input.

I'm using OpenCV SelectiveSearch Segmentation (opencv.ximgproc.segmentation.createSelectiveSearchSegmentation.switchToSelectiveSearchFast) to find boxes in the input. Let's say OpenCV gives me 6000 results. Even if I preprocess them and reduce the number to 3000, it will take too long to predict each result one by one with the model, like 45-90 seconds. I was thinking of making a realtime application :P

TL;DR: OpenCV SelectiveSearch Segmentation gives me too many results. Even if I preprocess them, the number is high.

Is there any way to shorten this number? Or is there any other way to detect boxes without using opencv selective search?

1

writerwritesalot t1_j1zff8d wrote

How often should you check the validation/test loss when training a model? Specifically, in a situation where 1 epoch takes at least 24 hours to run.

1

vsmolyakov t1_j22uo5z wrote

The reason for checking the validation loss against the training loss is to see how well the model is learning and whether it’s overfitting. You would need as many data points as necessary to make that assessment.

1

roseinmybud t1_j22wbvo wrote

I just downloaded Big Sur 11 on my MacBook Air; should I uninstall it before downloading Monterey or Ventura? I didn’t realize there were more recent updates. If I don’t uninstall Big Sur 11 and go straight to installing Ventura 13, will it take up more storage? I have a 2013 MacBook Air.

1

billy_bouldering t1_j23988x wrote

I'm interested in learning more about AI alignment. I'm considering a PhD on the topic. Does anyone have any book/papers recommendations?

1

j15t t1_j23stzb wrote

I am trying to understand loss functions when working with sequential (time series) data. Specifically, simple next-step loss functions don’t seem to capture the nuance I would like and are brittle when projecting into the future. Are there some topics/keywords/papers that explore more advanced (expressive?) loss functions in this context?

I am most interested in: predicting higher-levels features of a time series (e.g. variance, confidence bounds) and long term predictions that are more robust. I don’t know how to describe some of the concepts I’m searching for, so some high level discussion or tutorial would also be very helpful. Thanks!

1

Pleasant-Resident-53 t1_j26cuzg wrote

I'm learning about simple gradient descent in regression models. In my example 3-D graph, it shows the algorithm starting from one point and eventually reaching a local minimum (the lowest value of the cost function). But isn't the whole point to reach values of w and b that produce a J(w, b) of 0? On the graph, the local minimum is at a point (w, b) with a negative J(w, b), which seems counterintuitive given the point of the algorithm. Is a negative J(w, b) good, or have I just misunderstood this?
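Assuming J is the usual squared-error cost from introductory courses, J(w, b) = (1/2m) Σ (w·x + b − y)², it is a sum of squares and so can never be negative. A surface that dips below zero would have to be a different or purely illustrative function. The goal of gradient descent is also the smallest reachable J rather than J = 0, which noisy data generally cannot reach. The sketch below uses invented data points and a hand-picked learning rate to show both facts.

```python
def cost(w, b, xs, ys):
    """Squared-error cost J(w, b) = 1/(2m) * sum((w*x + b - y)^2)."""
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_step(w, b, xs, ys, lr):
    """One gradient descent update of (w, b) on the cost above."""
    m = len(xs)
    dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / m
    db = sum((w * x + b - y) for x, y in zip(xs, ys)) / m
    return w - lr * dw, b - lr * db


# Invented, slightly noisy data roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w, b = 0.0, 0.0
history = [cost(w, b, xs, ys)]
for _ in range(2000):
    w, b = gradient_step(w, b, xs, ys, lr=0.05)  # lr picked by hand for this data
    history.append(cost(w, b, xs, ys))
```

On this data the cost falls from its starting value and levels off at a small positive number (about 0.01), because no straight line passes through all four points.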

1

em_Farhan t1_j28555z wrote

Hi, I want to go to the field of Machine Learning, I've a Computer Science background and cannot afford the college for the time being. I want to do self learning, I've found 2 resources "Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow Concepts, Tools, and Techniques to Build Intelligent Systems" by "Aurélien Géron" and "CS 329S: Machine Learning Systems Design Stanford, Winter 2022" which one should i select? Or any other good resource for self learners? Please Guide. Thanks.

1

Intelligent_Ad_7692 t1_j292awm wrote

Hello, I am new to the ML community and trying to learn more! Currently I am reading Deep Learning with Python, second edition, by Francois Chollet. I am trying to run the examples in Colaboratory, but when I run them I get NameErrors: " " not defined. Just wanted some help or to see if anyone has run into anything similar.

import tensorflow.keras as keras

model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])

model.compile(
    optimizer="rmsprop",
    loss="binary_crossentropy",
    metrics=["accuracy"])

model.fit(x_train, y_train, epochs=4, batch_size=512)

results = model.evaluate(x_test, y_test)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-2-52370d4415e6> in <module>
      2
      3 model = keras.Sequential([
----> 4     layers.Dense(16, activation="relu"),
      5     layers.Dense(16, activation="relu"),
      6     layers.Dense(1, activation="sigmoid")

NameError: name 'layers' is not defined

Here is the git: https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_getting-started-with-neural-networks.ipynb

thx for any input:)
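The traceback says the name `layers` was never defined in the session: `import tensorflow.keras as keras` only creates the name `keras`. A likely fix, assuming the book's notebook defines it in a cell that was skipped (worth checking against the linked notebook), is to import `layers` explicitly, or to write `keras.layers.Dense(...)` instead. The sketch below shows the first option. The `HAS_TF` guard is only there so the snippet does nothing on a machine without TensorFlow, and `x_train`, `y_train` still have to come from the earlier data-loading cells.

```python
import importlib.util

# Only build the model if TensorFlow is actually installed.
HAS_TF = importlib.util.find_spec("tensorflow") is not None

if HAS_TF:
    from tensorflow import keras
    from tensorflow.keras import layers  # this import defines the name `layers`

    model = keras.Sequential([
        layers.Dense(16, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])

    model.compile(
        optimizer="rmsprop",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    # model.fit(x_train, y_train, epochs=4, batch_size=512) would follow here,
    # once x_train and y_train exist from the data-preparation cells.
```

In a notebook, the same kind of NameError also appears when cells are run out of order or after a runtime restart, so re-running everything from the top is worth trying first.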

1

Adrien-Chauvet t1_j2a8y90 wrote

Hello, is any company in France working on Artificial General Intelligence, besides DeepMind's offices in Paris? I have a hard time finding info about it, and most French companies I know of are working on very narrow applications of specific technologies.

1

Mustang1011 t1_j2am1nn wrote

What is something I can do when my linear regression model is not taking a feature that accounts for deviations into consideration for the prediction results?

1

Mustang1011 t1_j2awz8z wrote

How do I increase my linear regression model’s ability to take a feature into account in its results? I have two features that both contain the desired outputs from the prediction and I would like to somehow get their results to overlap. When I check for corr they do not share the same correlating features, and when I predict X on each their results are almost contrasting, even though predicting on feature 1 produces more accurate results.

1

Arkq123 t1_j2bejqx wrote

I'm working on a project wherein we train a neural network for parameter regression. We've noticed our model's predictions vary a fair amount each time we retrain the same model architecture, presumably from the stochastic gradient descent. E.g.

  • Training session 1: The model predicts 10.26 for output variable Y
  • Training session 2: The model predicts 13.61 for output variable Y
  • Training session 3: The model predicts 8.14 for output variable Y

Is there some de facto way to build a statistical conclusion of our results?

I suppose the simplest method would be training the model say 10 times across different random seeds and presenting the mean and std for each output variable. But I'm not sure if there is a better or more standard way of doing this.
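A minimal sketch of that seed-averaging idea, using only the Python standard library. The three values are the predictions quoted above; the name `summarize_runs` is made up for illustration, and in practice each value would come from retraining with a different random seed.

```python
import math
import statistics


def summarize_runs(predictions):
    """Return (mean, sample std, standard error) for one output variable."""
    mean = statistics.mean(predictions)
    std = statistics.stdev(predictions)  # sample std (n - 1 in the denominator)
    sem = std / math.sqrt(len(predictions))  # standard error of the mean
    return mean, std, sem


# Predictions for Y from the three training sessions listed above.
runs = [10.26, 13.61, 8.14]
mean, std, sem = summarize_runs(runs)
print(f"Y = {mean:.2f} +/- {std:.2f} (std over {len(runs)} runs, sem {sem:.2f})")
```

The standard error shrinks as more seeds are added, which gives a rough sense of how many retrainings are worth running.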

It seems like this would be a common thing to do but I'm struggling to find information - maybe I am searching the wrong keywords.

1

johnwayne2413 t1_j2c47xl wrote

In order for self-driving cars to have superhuman driving capabilities, would they also need to have superhuman senses, like vision, hearing, temperature, sonar, radar, lidar, etc.?

1

pmac_red t1_j2cmqgg wrote

I've got a lot of experience writing software, specifically web services, but the AI/ML stuff is new to me. I'm reading a lot and can wrap my head around the code/frameworks side of it but the math and algorithm stuff is Greek.

I'm currently playing with AWS SageMaker (it seems easy enough, and I already have an AWS account). My goal is to experiment with a problem I have at work:

Full context:

We are a SaaS API product that customers integrate with.

Customer onboarding is a big focus right now. Integration payloads (JSON) can be pretty large, e.g. up to a couple hundred properties, so it can be a little tricky for developers on the customer side to map from their internal system data format to ours. Product is approaching this as an education problem: customer documentation, examples, etc. to help teach the customer how to integrate. I think the problem is that it's just over the edge of being too big to hold a complete mental model of the system-to-system mappings, so there's a lot of lookup and reference. I think if some sort of ML model could be trained on existing customer data, then a new integration could just present a payload and we could do most of the heavy lifting automatically, drastically reducing the complexity of the integration.

TL;DR

We have a target JSON document, customers have a source. I'd like to produce a set of mappings, e.g. addr1 -> streetAddress, to predict how to map the source to target.

Is this a common problem? Is there a known algorithm/model I should look at or a family of which I should look at?

I'd appreciate any fingers pointed in the right direction.

1

trnka t1_j2d4wt7 wrote

There must be a name for this but I don't know it. It's a common problem when merging data sources.

If you have a good amount of data on existing mappings, you could learn to predict that mapping for each input field. The simplest thing that comes to mind is to use character ngrams of the source field name and predict the correct target field name (or predict that there's no match).
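A rough sketch of the character n-gram idea, standard library only. To be clear about the swap: instead of a trained classifier, this uses a nearest-neighbour lookup over n-gram overlap (Jaccard similarity) against known mappings. The field names, the `known_mappings` dict and the 0.2 threshold are all made up for illustration.

```python
def char_ngrams(name, n=3):
    """Set of character n-grams of a field name, with start/end markers."""
    padded = f"^{name.lower()}$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Overlap between two n-gram sets, from 0.0 (disjoint) to 1.0 (equal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_target(source_field, known_mappings, threshold=0.2):
    """Return the target of the most similar known source field, or None."""
    grams = char_ngrams(source_field)
    best_target, best_score = None, 0.0
    for known_source, target in known_mappings.items():
        score = jaccard(grams, char_ngrams(known_source))
        if score > best_score:
            best_target, best_score = target, score
    # Below the threshold we predict "no match" rather than guess.
    return best_target if best_score >= threshold else None


# Hypothetical mappings collected from earlier customer integrations.
known_mappings = {
    "addr1": "streetAddress",
    "address_line_1": "streetAddress",
    "zip": "postalCode",
    "postcode": "postalCode",
    "fname": "firstName",
}

print(predict_target("addr_1", known_mappings))
print(predict_target("quantity", known_mappings))
```

With enough labelled mappings, the same n-gram features could feed a proper classifier; the lookup above is only the simplest thing that exercises the idea.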

If you also have a sample of data from the customer, you could use properties of the data in each field as input as well -- the data type, range of numeric values, ngrams for string fields, string length properties, etc.
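As a sketch of what those per-field properties could look like, here is a small profiler written with the standard library only. The function name `profile_field` and the particular feature set are assumptions for illustration, not a fixed recipe.

```python
def profile_field(values):
    """Summarise a sample of values from one field as simple features."""
    non_null = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so exclude it explicitly.
    numbers = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    strings = [v for v in non_null if isinstance(v, str)]
    return {
        "null_fraction": 1 - len(non_null) / len(values) if values else 0.0,
        "numeric_fraction": len(numbers) / len(non_null) if non_null else 0.0,
        "string_fraction": len(strings) / len(non_null) if non_null else 0.0,
        "min_number": min(numbers) if numbers else None,
        "max_number": max(numbers) if numbers else None,
        "mean_string_length": (
            sum(len(s) for s in strings) / len(strings) if strings else None
        ),
    }


# Made-up sample of values from one customer field.
sample = ["12 Main St", "4 Elm Road", None, "99 High Street"]
print(profile_field(sample))
```

These features can be combined with the field-name n-grams so that two fields with unhelpful names can still be matched on what their data looks like.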

As for the business problem, even with automated mapping you probably need to force customers to review and correct the mappings or else you might end up with complaints from customers that didn't review.

All this isn't quite my area of expertise, hope this helps!

1

NotYourBaker t1_j2f4ylo wrote

I'm totally new to ML. Where do I start? A book, Udemy course, tutorial to follow? I have about 10 years of experience in web application development, but no AI/ML background. Thanks!

1

Obvious-Set-1981 t1_j0ynond wrote

Hi, everyone. From tutorials I know a thing or two about manipulating dataframes with pandas, but what if I want to build a real-world application with, let's say, a MySQL database? How do I apply ML algorithms? Do I always have to find a way to convert the database to a CSV file first?

0

vsmolyakov t1_j22v7n7 wrote

You may want to use Spark SQL for big data applications, Scala for processing the data, MLlib for machine learning and the Parquet file format for saving the results.
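For data small enough to fit in memory, a CSV export is not strictly needed either: rows can be queried straight from a database connection and handed to a model. The sketch below uses `sqlite3` as a stand-in for MySQL only because it ships with Python; the `houses` table and its numbers are made up, and the least-squares fit is written by hand to keep the example dependency-free.

```python
import sqlite3

# In-memory SQLite database standing in for a real MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE houses (area REAL, price REAL)")
conn.executemany(
    "INSERT INTO houses VALUES (?, ?)",
    [(50, 150), (80, 240), (100, 300), (120, 360)],  # toy data, price = 3 * area
)

# Pull the training data with a query instead of exporting a CSV file.
rows = conn.execute("SELECT area, price FROM houses").fetchall()
conn.close()

xs = [area for area, _ in rows]
ys = [price for _, price in rows]

# Ordinary least squares for a single feature, computed by hand.
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

print(f"price ~ {slope:.2f} * area + {intercept:.2f}")
```

A MySQL driver that follows the Python DB-API would be used in much the same way: open a connection, run a SELECT, and feed the fetched rows to whatever model is being trained.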

1