
Palm Internal - Pulse Deep-Dive (Tech Academy) - 2026-03-09

Metadata

  • Date: 2026-03-09
  • Company: Palm (Internal)
  • Palm Participants: Emma Sjöström, Rodel, Art, Giannis, Simon, Nielsen
  • Type: Internal Discussion (Tech Academy Spring 2026)
  • Domain Areas: Pulse, Data, Reporting, Scenario Modelling
  • Recording: None

Summary

Context

Internal Tech Academy session — the first of the Spring 2026 edition. Art presented the Palm MCP architecture and Pulse's technical foundations, followed by Rodel covering Palm Chat architecture, security model, context management, and the AI Digest homepage feature. The session also included a demo of scheduled agent workflows (data report agents).

Key Discussion Points

  • Palm MCP vs Palm Chat — MCP is internal-only (via Claude/Cursor), built on BigQuery with schema-first workflow to prevent hallucinations. Palm Chat is the customer-facing product with full control, logging, and extensibility.
  • BigQuery architecture — Uses dbt with silver/gold layers (no bronze). MCP accesses gold layer ~80-90% of the time, silver for complex queries.
  • Schema-first workflow — Key insight from hackathon: forcing the LLM to get_schema before querying eliminated table name hallucinations and dramatically improved accuracy.
  • Security model — Infrastructure-level security via GCP service account impersonation. Per-customer service accounts with row-level security on customer_public_id. No metadata leakage (row counts, etc.). Prompt-level security is insufficient alone.
  • Context management — Last 20 messages kept in full, older messages summarized. User summary persisted across last 10 threads. No retrieval yet, just summarization.
  • AI Digest (homepage) — Structured sections (summary, account balance activity, accounts needing attention, notable transactions, confirmed forecasts). LLM creates a plan, executes it, returns structured JSON with public IDs. Treasury API validates all references before rendering.
  • Scheduled agent workflows — Demo of data report agents that can be scheduled to run and produce PDF/Excel outputs. First step toward agentic capabilities.
  • Slack integration — /ask slash command planned for the Slack app, using the same backend as Palm Chat
  • Scenarios via natural language — Simon working on creating forecast scenarios through Pulse chat instead of UI buttons.
  • Forecast referencing — In-development feature to reference specific forecasts in chat for contextual analysis.
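The schema-first workflow above can be sketched roughly as a hard gate in code. This is an illustrative sketch, not Palm's implementation: the table names, column lists, and helper functions below are all made up, but the mechanism (the LLM must fetch a table's schema before any query touching it will execute) follows the description.

```python
# Hypothetical sketch of a schema-first query gate: the LLM must call
# get_schema() for a table before execute_query() will accept a query
# that references it. All table and column names are illustrative.

KNOWN_TABLES = {
    "fact_transactions": ["customer_public_id", "amount", "booked_at"],
    "dim_account_balances": ["customer_public_id", "account_id", "balance"],
}

fetched_schemas: set = set()

def get_schema(table: str) -> list:
    """Return the column list for a table; the LLM must call this first."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"Unknown table: {table}")  # no hallucinated names
    fetched_schemas.add(table)
    return KNOWN_TABLES[table]

def execute_query(sql: str) -> str:
    """Run a query only if every referenced table's schema was fetched."""
    referenced = [t for t in KNOWN_TABLES if t in sql]
    if not referenced:
        raise ValueError("Query references no known table")
    missing = [t for t in referenced if t not in fetched_schemas]
    if missing:
        raise RuntimeError(f"Call get_schema first for: {missing}")
    return f"OK: executed against {referenced}"
```

Because `execute_query` refuses to run against a table whose schema was never fetched, a hallucinated table name fails at the `get_schema` step instead of producing a confusing SQL error.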

Pain Points

  • LLMs hallucinate table names without schema-first enforcement
  • Prompt-level security is easily bypassed — must use infrastructure-level controls
  • MCP lacks control/observability over user queries (can't validate, improve, or log)
  • Context windows get polluted when mixing cross-customer data
  • ACL entity-level filtering will require per-customer-per-entity service accounts (complex scaling)
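The infrastructure-level control these bullets point to is, per the transcript, per-customer service accounts combined with BigQuery row-level security on `customer_public_id`. A rough sketch of what provisioning that could look like; the service-account naming scheme, project, and table names here are invented for illustration, though the `CREATE ROW ACCESS POLICY ... GRANT TO ... FILTER USING` DDL shape is standard BigQuery:

```python
# Illustrative sketch of per-customer row-level security provisioning.
# Service-account names, project, and dataset/table names are made up;
# only the DDL shape follows BigQuery's row access policy syntax.

def service_account_for(customer_public_id: str, project: str) -> str:
    """Derive a dedicated per-customer service account email (hypothetical scheme)."""
    return f"pulse-{customer_public_id}@{project}.iam.gserviceaccount.com"

def row_access_policy_ddl(table: str, customer_public_id: str, project: str) -> str:
    """Build the DDL that restricts a customer's service account to rows
    whose customer_public_id matches, enforced in the infrastructure
    layer rather than the prompt."""
    sa = service_account_for(customer_public_id, project)
    return (
        f"CREATE ROW ACCESS POLICY customer_{customer_public_id}_filter\n"
        f"ON `{table}`\n"
        f"GRANT TO ('serviceAccount:{sa}')\n"
        f"FILTER USING (customer_public_id = '{customer_public_id}')"
    )

ddl = row_access_policy_ddl("gold.fact_transactions", "acme01", "palm-prod")
```

The ACL scaling pain point falls out of this design: entity-level filtering would need a policy (and service account) per customer-entity pair, since the filter predicate is fixed per grantee.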

Feature Requests & Needs

  • Personalized digest — Learn from user behavior to customize homepage blocks (e.g., Lucia wants high-level, Amanda wants details)
  • Learning loop — Use Palm Chat logs to discover what users ask about repeatedly, surface as digest blocks
  • Flexible file ingestion — LLM-assisted file upload for new data types (budgets, investment terms) without building static UIs each time
  • Unstructured document ingestion — Policy documents, investment policies as context for Pulse (requires retrieval/RAG)
  • Chat on every page — Contextual chat toggle that references current page/dashboard
  • Natural language scenarios — Create forecast scenarios through chat instead of buttons

Jobs & Desired Outcomes

Job: Answer treasury data questions without requiring engineering support
Desired Outcomes:
  • Minimize the time from question to insight for treasury teams
  • Reduce dependency on engineering team for ad-hoc data analysis
  • Increase the accuracy of LLM-generated data insights

Job: Provide personalized daily cash digest to each treasury user
Desired Outcomes:
  • Minimize the time to understand what changed overnight across accounts
  • Increase the relevance of proactive insights based on user role and preferences
  • Reduce the need to manually check multiple dashboards each morning

Job: Schedule recurring analysis workflows and receive reports automatically
Desired Outcomes:
  • Minimize the manual effort to produce recurring treasury reports
  • Increase the timeliness of scheduled cash position updates

Domain Insights

  • Palm's data warehouse uses BigQuery with dbt (silver + gold layers, no bronze)
  • First externally-exposed Python service (everything else is Go) — chosen for better AI/ML tooling
  • MCP tools are imported directly as Python functions into Palm Chat service (no inter-service calls)
  • Firebase tokens used for user identification in both Go and Python services
  • Service account impersonation is the security backbone — custom roles with row-level filters
  • Hackathon origin story: Pulse started as a way to offload repetitive treasury analyst questions from engineering
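The third insight above (MCP tools imported directly as Python functions) can be sketched as follows. This is an illustrative sketch under assumptions: the tool function, its docstring, and the spec format are invented, but the pattern is the one described, where plain callables serve double duty as MCP tools and as locally-dispatched LLM tool definitions, with docstrings supplying the tool descriptions.

```python
# Illustrative sketch (not Palm's actual code): MCP tool functions are
# plain Python callables, so a chat service can import them directly and
# expose them to an LLM client as tool specs, with no inter-service call.
import inspect

def get_schema(table_name: str) -> dict:
    """Return column names and types for a BigQuery table."""
    # Stub body; the real tool would query BigQuery metadata.
    return {"table": table_name, "columns": ["customer_public_id", "amount"]}

def to_tool_spec(fn) -> dict:
    """Build an LLM tool definition from the function's name, signature,
    and docstring, the same metadata an MCP server would advertise."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": list(sig.parameters),
    }

TOOLS = {fn.__name__: fn for fn in (get_schema,)}

spec = to_tool_spec(get_schema)
# The chat backend dispatches a model's tool call locally:
result = TOOLS["get_schema"]("fact_transactions")
```

Running the functions in-process keeps one source of truth for tool behavior while avoiding the latency and failure modes of a separate MCP server hop; extracting it into a real service later only changes the dispatch line.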

Action Items

  • [ ] Everyone: Read Emma's AI correctness/governance framework doc on product hub and provide feedback
  • [ ] Team: Discuss evaluation methodology for Pulse responses (correctness, governance, company-specific context)
  • [ ] Schedule upcoming Tech Academy sessions: Instruments, Marketing/GTM, AI-assisted workflows
  • [ ] Finalize chat + AI digest launch for Wednesday marketing release

Notable Quotes

"A lot of these requests would take hours, if not days from our engineering team to sort of figure out the questions" — Art, on why Pulse was created

"We kind of enforce the LLM to get the schema before hallucinating names. And this is really what's working." — Art, on the schema-first breakthrough

"Doing security on this level is kind of stupid... that's why we actually do security in the infrastructure layer" — Art, on prompt-level vs infrastructure-level security

"We lack control once the MCP access is given — we actually have no idea what the users are querying" — Art, on why Palm Chat is needed alongside MCP

"If we learned from our users, their preferences... I think it's hard to ask people what do you want to see. They sometimes don't know themselves." — Emma, on the personalization vision

"This is just a first step into personalization. And I think it's a really powerful offering if we can take that further." — Rodel, on the AI Digest


Full Transcript

Meeting Title: Deep-dive: Palm Pulse
Date: Mar 9

Transcript:

Them: I saw gasparin's tldv, but I think it's very useful to have at odb1. It was declined. Enough to try and add there to the VV to this goal.
Me: Yeah, I can do it otherwise, but yes, please try. If you haven't tried it out, you should try.
Them: I've added mine. It should be joining in a moment. Alrighty. We're recording. This works as expected. I think so. All right. To kick it off: Tech Academy, spring 2026 edition. So after quite a successful winter edition last year, we're here to do a bit more knowledge sharing. Yeah, I feel it's spring, so it's like 16 degrees. At least here. In my opinion, it's spring enough. To dive into it. We'll have a few sessions actually in this edition, and some of them are actually not finalized yet. But when looking at what we did last time, we very much did a deep dive into the whole categorization and forecast domain. Of course, the core products. And we'll have plenty of more things that we will discuss around Forecast in general. But we are diving a bit deeper into different topics this time, and they're more diverse, I would say. So today we're going to dive into Pulse and see what's it all about, what the hype is, and Art will walk you through a lot of the stuff, and I'll do so as well. Next week we'll have a session around instruments. So, very bluntly put, investments. We'll dive deeper into that and how it works in the platform and why we are focusing on that. And then further on this month, next month, we haven't scheduled those yet, but we'll at least do one marketing and go-to-market session. I'm discussing with Maria on when to schedule that one; we'll probably end up this month. And then Simon already asked a question, very, let's say, top of mind, around how do we use our AI-assisted workflows to the max? And that's around our code, that's around our skills. It's around everything I think that we do there. And we'll do a deep-dive session on that one as well, to both share and discuss, actually, how we're doing this. All right. Let's dive into it. I'm giving the floor to Art. Hello. Sorry. Just going to go into presentation mode before I start. Cool. Right. So, yeah, today I'm going to talk about Pulse. I think most of you know what Pulse is. But.
Yeah, I'm just going to run you through why we have Pulse, what Pulse is, and then the future of Pulse. We're going to split this presentation between me and Rodel, so we'll switch screens a little bit. Yeah. Why Pulse? We realized that a lot of the questions we were getting, like last year, three months ago almost, a lot of them were basically just us having to be the treasurers' analysts. And I think the idea when we started the hackathon was like, how can we sort of allow LLMs to take care of all of these requests that treasurers have? A lot of these requests would take hours, if not days, from our engineering team, Jen and Giannis, to sort of figure out the questions, why the forecasts were a certain way, what about the cash balances? And I think that's where it started. We basically just wanted to automate some insights. And then with the new Opus 4.5, I think it was, we realized the LLMs now are actually quite, quite smart. They're quite capable of interacting with our data and creating actual useful insights. I think Rodel tried this last year, and a lot of the ideas from the conversational agents, as we called them, were sort of just made new with more capable models. That's why we made Pulse initially, to just answer questions, and now we sort of want to expand on that. However, in this session, I think what most people care about is how it's made. So Pulse is actually kind of two things. Kind of not, but we have the Palm MCP. This is actually what was built during the hackathon, and we've just improved it a little bit. And then we have the Palm Chat, which is what our customers will interact with. Yeah. So Palm MCP, I think most people have used it via Claude or Cursor. So this is just internal. This is only allowed with a usepalm domain, whilst Palm Chat is going to be user-facing and anyone with the Palm platform will have access to it very soon. Yeah. So, Palm MCP. What's an MCP? It stands for Model Context Protocol.
It's a standard, an open-source library that Anthropic built, and basically it's just a connection between a data source and a large language model. So in this case, Palm, or our data layer, would be the server and everything in it, and then the client would be Cursor or Claude or whatever LLM people are using. It's really a very simple idea. It's basically just a formalized way for an LLM to interact with our data. So I tried to make AI come up with a comparison, and I think this was good enough: if REST APIs are apps talking to data, then MCPs are basically LLMs talking to data natively. So, yeah, basically what happens is that the MCP is a layer on top of our BigQuery. There's a few more stuff happening there, but what we do is just expose the LLM to our BigQuery data. It interacts in a way that we instruct it to, and then it also has access to the tools that sort of help the MCP get insights, fetch data, execute queries and whatnot. Yeah. So the BigQuery, this is our data warehouse. We are only exposing the dbt layer. So basically, what we create in dbt is basically an engineered and transformed layer of our key data sources. We sort of try to follow a mixture of the medallion architecture and some of the standard dbt schema suggestions. We don't follow the medallion fully, because we don't have a bronze layer, since a lot of our data is already in a good state before it comes to the dbt layers. But we do have the concept of a silver layer and a gold layer. And the MCP is usually accessing the gold layer; probably 80, 90% of the time it's used to come up with insights and analyze the data. However, in some cases, for more complex queries and insights, the MCP I think does go into the silver layer too. This is where the MCP gets a bit more creative and then sort of tries to find data that it might not find in the gold layer. Yeah. So that's the BigQuery stuff. Quite simple, really.
Like most of the work has been done before, because that's what we're using for embeddable and whatnot. However, we had to create a set of tools, and these tools are basically rules on how the LLM can interact with our data, as I said. Yeah. Basically the first tools we built were to list the tables, get the schema, get the reference table, and then we give the LLM a sort of execute power. So basically, the LLM, after it finds everything it needs, is sort of able to then execute the final query. Tools in MCPs are quite interesting. So in most cases the docstrings in Python are quite useless for agents because they don't have access to them. However, with the MCP architecture, the docstrings are actually what help the LLM figure out what the tool is supposed to do. So for example, for the execute query tool, the LLM would read the docstring, it would get the arguments it needs, and then it gets the tool response that it would get once the SQL query is executed. Yes, but basically with tools we sort of control how our queries are made. So in the prompt we have the mandatory schema-first workflow. Like, I think during the hackathon, I quickly realized that the LLM has this tendency to just create and hallucinate table names. So sometimes I would also hallucinate with the LLM; I'd be like, okay, this forecasting balance currency table, like, this sounds like a cool name, but it really didn't exist. So a lot of the times you would have to go back and forth with the LLM to sort of be like, hey, this table doesn't exist, use this and that. And that's why we created the get schema tool. So before any execution is done, we make the LLM get the schema, list all of the available tables, and then basically in that way, when it executes the query, it's executing on actual data that exists, not stuff that it wants to hallucinate or assumptions that it makes along the way. We also have the system prompt, which gives more context to the LLM.
So basically we are telling the MCP, the Pulse, that it's an AI treasury agent, it's working for multiple treasury teams. We are giving it some hints and tips, we tell it that the fact transaction table is one of the most useful tables, and then it sort of just gets a direction, a sense of what its current task is. As most people know, after every LLM session you have, it's basically like it has forgotten all of the stuff that you have discussed and then sort of starts from scratch. So that system prompt helps. It just sets some basic guidance for the MCP. Yeah. I think this is really what increased the performance during the hackathon when we built this. We kind of enforce the LLM to get the schema before hallucinating names. And this is really what's working. Because now, even if it misunderstands the question, it will sometimes query the data and then realize that, yeah, and sort of reformulate its own plans on how to figure out the task. Yeah. Key tables that are referenced: I have to say that we don't really know if this is true, but I think after three months of testing it, I think that the tables being used most are the transactions table, account balances, forecast periods, and then the transaction period summary. I have to say that a lot of this is based on testing as well. So I think the dim forecast period is basically me trying to figure out if it can actually explain forecasts and why forecasts are not behaving, or what's actually causing forecasts to show certain values. Once we actually roll this out to actual customers, we will have more data. We will have proper data on what tables are being referenced the most. Yeah. Also, dev and prod are separate environments, so the MCP will have context that it is querying in either the dev or prod environment. So if you are working on a feature and the data is only in dev, you should sort of mention to the MCP that, hey, I actually need that data. And then the MCP switches.
We also do some query validation. So basically we tell it in the prompt to start with a SELECT or WITH, and this is basically to sort of discourage the LLM from trying stuff like deleting tables or dropping everything. And yeah, the main idea of this is not actually security, because doing security on this level is kind of stupid. However, it is to make the LLM more efficient. For the MCP, for example, we don't really need the customer isolation warning. However, if we are, for example, diving deep into On's data, then the LLM is sort of aware that we are only looking at On's data and we are not looking at all of our customers. Most of our analyses are isolated to one customer. So we do have some very basic query validation. However, to do query validation in a prompt is, as I said, kind of stupid. It's basically this meme: LLMs can very easily bypass system prompts. And even though the newer models are getting much better and much more compliant with system prompts, a user, if determined enough, would most likely outsmart the LLM model. And then it could maybe look at competitors' data or maybe drop a table. So that's why we actually do security in the infrastructure layer. I think Rodel will talk more about this, because for the Palm MCP this is not too important. So basically we don't allow the LLM to do any sort of insert, update, delete, create, alter, whatever. Like, we sort of set this in the infrastructure layer. However, the other stuff is less important. So for example, the customer isolation warning: although we tell the LLM to not do it, it really doesn't matter for the MCP, because this is an internal tool, and most people will have access to all of our customers' data anyways. Yeah, so basically, that's the big thing about Palm MCP. We have the BigQuery, the tools and then the LLM.
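The query-prefix validation described here (allow only SELECT/WITH, reject write statements) is easy to replicate as a hard check in code. A minimal sketch, with the same caveat made in the talk: this is a convenience filter to keep the LLM on track, not a security boundary; the keyword list and function name are illustrative.

```python
# Sketch of the query-prefix validation described above: allow only
# read-style statements, reject DML/DDL keywords. This keeps the LLM
# efficient; real security is enforced at the infrastructure layer
# (service accounts, row-level access), not here.
import re

FORBIDDEN = ("INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER", "MERGE", "TRUNCATE")

def validate_query(sql: str) -> bool:
    """Return True for SELECT/WITH queries containing no write keywords."""
    normalized = sql.strip().upper()
    if not normalized.startswith(("SELECT", "WITH")):
        return False
    # \b word boundaries avoid false hits inside identifiers like CREATED_AT
    return not any(re.search(rf"\b{kw}\b", normalized) for kw in FORBIDDEN)
```

Note that a determined user can still smuggle intent past any prompt-level or string-level check, which is exactly the argument for enforcing permissions on the service account instead.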
So basically, this is one of the providers, Cursor or Claude, and they sort of provide the model and all of the intelligence to make this MCP work. Yeah. LLM providers have very cool features. So for example, in Claude, you can ask Claude to make you graphs, and then it does generate graphs and little static web interfaces. All of the costs go to the LLM provider. So if we are having trouble with AI development cost, then if the user has the MCP, you know, that could be billed straight onto their plan and not our side. However, yeah, so with all of these cool things that LLM providers give, and it's cheaper, it's easier, why bother with Palm Chat? I think there's a lot of reasons why we still need the Palm Chat. One of them is that we lack control: once the MCP access is given, we actually have no idea what the users are querying. This means that we cannot validate the responses, we cannot improve the feature, and we really just have no control of our own tool. But there's also reasons where, without a chat, without a platform capability, it's very hard to build new features. On top of this, the Palm MCP will not be able to be more than a simple data analyst. It is technically possible, however, that would be quite a big project, and I don't think it would be worth it for us to try and expose that sort of level to the MCP. And of course, the MCP does require slightly techier users. Like, they would have to have access to Claude or Cursor or ChatGPT, and they would have to sort of authenticate and add this custom MCP. So it's not really that straightforward. It's definitely not complicated, but it does require, I think, a slightly techier bunch to actually use the MCP. So that's really why we have Palm Chat. I think it's time for you, Rodel. Thanks. I think Art mentioned
everything around the MCP and also entering into the chat, but I'll dive a bit more into how it works behind the scenes. And I'll also dive a bit into the homepage digest that we are going to release. I will go back and share my screen again. Looking a bit into the setup, the architecture of the chat: the user comes into the platform. We have a chat page. On the chat page, you can start new chats, you can go into previous ones, you can boot up an old conversation, an old thread, you can start a new message. All that logic is handled by the treasury API. So everything from, let's say, getting a thread or a conversation, listing them, showing what the previous messages are, showing the output, etc., that is all core capabilities of the treasury API. And it sits pretty comfortably there. Everything that comes to the generative AI capabilities is actually the first Python application that we're making, that we're externally exposing into the platform. The reasoning there is actually not that complicated. The tooling in Python is just way better than in Go, which we use for pretty much everything else externally. The tooling setup there just makes it a little more difficult, I think, to have the same capabilities that you would have in Python. And additionally, one of the things that we actually do now, and that's actually a bit of a Python trick, which we might get rid of at some point: the tools that Art has described from the MCP. You can see the MCP as a standalone server. So we have it as a standalone server. But you can also just import the functions in your Python code, and then you can expose them as tool calls to an LLM client. So then you don't need to have inter-service calls; you just import the tools as Python code, just how you would do in the MCP itself. But instead of calling another service, you would have them inside the Palm Chat service.
More convenient, a bit easier, I think, for us to go live. We might extract this properly at some point, but it's virtually the same thing as doing an inter-service call, except you don't need to do that. It's just easier. But I think it's a detail that should not be overlooked. Art already explained that BigQuery plays the most important role right now in the chat. So every tool that we currently expose to the LLM is BigQuery-related. Of course, in the future that is not necessary; we can go wild in terms of the different tools that we can provide, both in the MCP and in the chat. But as Art already alluded to, the BigQuery part, in terms of our own data, was probably the most difficult part, I think, to get right. So zooming into that a bit more: the same way as we have in Go, where we can identify a user based on their Firebase token, we do the same thing in Python, where we identify the user based on their Firebase token. So we can see: this is the user, this is the customer that it's related to, these are, let's say, the additional fields that might be configured for that user. That informs the chat in terms of: this is the user and this is the customer that we should use. If you go to the chat page after the session, you might see that there's a view-as-customer button. I think Simon might already have opened a PR, but we're getting rid of that one. If we kept it, you would basically be able to query everything. So it's like the MCP: you query across all customers, and pretty much no real boundaries there. In the chat, of course, you want to not have that. So one of the reasons, I think, is that we very much contain the platform to a specific customer.
And as I'll also allude to later, if you mix data from specific customers, even on your own user with a specific customer, and you query across all customers, it saves that context and just pollutes the context window, which is not that great. So we're getting rid of that functionality. Which means that if you go to a customer, so if you switch customer, if you go to a workspace, then you would only be able to interact with the data that is for that specific customer. How does this work? We use something in GCP called service account impersonation. In pretty much every cloud provider, you have the concept of service accounts, and service accounts are typically non-user accounts that have certain access and permissions. What we do is we create service accounts for specific customers. So for each customer, we have a specific service account, and on that service account, we enforce specific permissions. What that means is we have a custom role which makes sure that if you go into the BigQuery layer, you can only see data that is specific to specific rows, with specific filters. The filter that we currently use is on customer_public_id. So any data set that we expose through the chat needs to have this field, customer_public_id, and through the service account impersonation, we enforce a filter on that customer_public_id, meaning that it is not possible to get around it. Let's say you are a specific user: you should not be able to bypass that. Internal users can bypass it; we have a specific service account for Palm users that can bypass this specific filter. But it's not possible to bypass it if you are not identified as an internal user, and that is specifically only the usepalm domain. So if you have a usepalm account you can bypass it.
I think the idea would be that we're getting rid of that, but in principle you can bypass it in that way; for users on customers you may not. And therefore we always have customer_public_id as an enforced filter. It cannot be bypassed. It's in the infrastructure layer. Also, any other access in terms of metadata of tables, so you can, for example, think about row counts or some other information, is not given to the service account. Meaning that they cannot get the row counts of tables, they cannot get extra metadata of tables. Only the data that they get access to can they query. And this is quite crucial, because in theory, or in practice, you can have data leakage, and you can have LLMs exposing, for example, total row counts of transactions. That might not seem like a big problem, but of course it exposes Palm data in general. So if they know we have 10 million transactions in total, it exposes a bit of data that we don't want to. So we manage that in this layer: service accounts have specific permissions, they cannot go outside of those boundaries. You can imagine that, for example with what we're now doing with ACLs, if you want to add that a user can only see specific entities, then we need to have a specific customer-and-entity-combination service account. So it definitely becomes trickier, I think, to go in those directions. And maybe there are some alternative things that we can think about, but this is the safest approach, making sure that we really restrict data access in BigQuery with row-level security, with specific filters in place. If you're working with the chat, and if you've worked with Claude, ChatGPT and others, and also Claude Code, it has the capability to remember things across sessions. In some cases you need to explicitly reference; in other cases it does it automatically. In chat interfaces it usually works automatically.
And for example, Claude probably uses some retrieval setup where it can just query across the entire history, see what is relevant, get kind of, all right, this is the relevant context, and then add that to the context window. We have a bit of a simpler approach for now, but I think we can definitely still expand on that further. And the simple approach is basically: if you look into a specific conversation, into a thread, the LLM always has context on the last 20 messages, what has happened, what were the tool calls that were done, what was the user's input, etc. After that, we summarize. So then we summarize what has happened in the older messages, and then we make sure that we don't pollute the context window further if the user goes and sends more messages. On top of that, we have a concept called user summary. So after the first five messages, we persist a summary of what has happened for that specific user, and we do that across the last ten threads. We restrict the amount of threads because otherwise, I think you can imagine, you just balloon the context window and then the summarization gets worse in terms of performance. Retrieval probably would work better, but again, I think we tried to see what is the most relevant, probably the last 10 threads. We summarize it, and then based on that, we add that to the context window to make sure that it has context on what has happened. What we do log, and that's the power, I think, of the Palm Chat currently, is everything that happens: every interaction, every tool call, every message that is put in, every output. We log it, we have it in the database, we can use it for analytics. We can analyze it however we want. And that can really, I think, in the short term, help us to find out: how does it work?
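The context-management scheme described here (last 20 messages kept verbatim, older messages collapsed into a summary) can be sketched as follows. The `summarize` function below is a stand-in for an LLM summarization call, and the message format is illustrative; the window size of 20 comes from the talk.

```python
# Sketch of Pulse-style thread context: keep the most recent 20 messages
# in full and collapse everything older into a single summary message.
# summarize() is a stand-in for an LLM summarization call.

WINDOW = 20  # messages kept verbatim, per the talk

def summarize(messages: list) -> str:
    """Stand-in for an LLM call that condenses old messages."""
    return f"[summary of {len(messages)} earlier messages]"

def build_context(thread: list) -> list:
    """Return the message list actually sent to the model."""
    if len(thread) <= WINDOW:
        return thread
    older, recent = thread[:-WINDOW], thread[-WINDOW:]
    return [{"role": "system", "content": summarize(older)}] + recent

thread = [{"role": "user", "content": f"msg {i}"} for i in range(25)]
ctx = build_context(thread)  # one summary message plus 20 recent messages
```

The per-user summary across the last ten threads works the same way one level up: summaries of whole threads, capped at ten, prepended to a new thread's context so older material can't balloon the window.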
We can also evaluate it, making sure, I think, that we use everything that the user is interacting with and then try to make sure that we learn from that. We're not doing that yet, but in theory, we can do all of this. That is, I think, the power to learn. Maybe shortly on Slack: we haven't made this available yet, but in the Slack app that we're going to launch, there's a slash command called /ask. It uses the same path that Palm Chat uses in the front end and has the same capabilities. It just works maybe a little bit less nicely; it's more rate-limited in terms of how much you can stream. But in principle it's the same setup that you can use to interface with external applications like Slack, and in the future maybe with Teams and different environments. Then lastly, on the homepage digest. So this is the new homepage, and how it's structured. It looks quite dynamic, but if you really zoom into it, it's actually quite structured. So I think if you go to the new home page on dev, you'll see there's a summary, there are notable transactions, notable account activity, confirmed forecasts if there are any, et cetera. So there's a few kind of fixed sections. The summary is always on top. We always have a chat bar where you can dive deeper, I think, into the digest if you want. Everything that's underneath is dynamic; the ordering is dynamic. It matters what the preference of the user is, it matters what has happened, I think, in the last seven days. That is dynamic, but the format itself isn't that dynamic. And I think that really helps with giving the LLM enough context on what it should analyze and what it should dive deeper into. So that's the first layer. Also, if we look at the instructions, the system prompt is quite specific:
These are the sections, this is what you need to analyze. We can add in the feedback from the user. So if you go to the new home page, you'll have a customize button, and the user can put in feedback that's user-specific. It can take that into account. And then, using all of that information, it would come up with a plan. So the plan would basically say: all right, this is a separate data set that you need to have as output, these are the steps that you need to take. It can analyze everything; it's like the plan mode that you would use in Cursor or in Claude. It comes up with a plan and then it executes the plan. And then from there you would have a structured data response, a JSON structure, with data per section, that would be validated. And it can have multiple rounds: let's say it's not correct in terms of data validation, you can go back and you can have the LLM fix it. In principle, it saves all of that information into the database, and then in the treasury API we have everything being further validated. So we pretty much reference the public IDs, so that the structured data has public IDs: let's say which transactions were interesting, which confirmed forecasts were interesting, which accounts were interesting, and then the treasury API will just fetch all of that. It will validate: do they really exist? It can happen, of course, that something was deleted in the meantime. It validates it, it does the lookup, it returns everything in a structured format that is readable by the front end, and then you can render it in the front end. In principle, simple. On the other hand, I think it's quite an interesting experiment of how much further we can take this, because it's really a first step into personalization. And I think it's a really powerful offering if we can take that further. All right. Giannis: One question regarding the blocks of the AI digest. Now I see that we have a confirmed forecast.
account balance activity, and notable transactions, if I'm not mistaken. Do we have more blocks, or are we focused only on those? Those blocks are what we have. Right, quickly sharing my screen here: this is on Dev. We have the summary, and then we typically have two blocks on balance activity: account balance activity and accounts needing attention. Then we have notable transactions; confirmed forecasts is not rendered here because there's probably nothing to show. But yes, the blocks are currently fixed, and that's what we currently have, Emma.
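The plan, execute, validate, render flow described above can be sketched roughly as below. This is a minimal illustration, not the actual implementation: the section names, the `plan_and_execute` callable, and the retry behavior are all assumptions made for the sketch.

```python
REQUIRED_SECTIONS = {
    "summary",
    "account_balance_activity",
    "accounts_needing_attention",
    "notable_transactions",
}

def validate_digest(payload: dict) -> list[str]:
    """Return validation errors for the structured digest JSON (empty list = valid)."""
    errors = []
    missing = REQUIRED_SECTIONS - payload.keys()
    if missing:
        errors.append(f"missing sections: {sorted(missing)}")
    # Every referenced entity must carry a public ID that the Treasury API
    # can later look up and re-validate before rendering.
    for tx in payload.get("notable_transactions", []):
        if "public_id" not in tx:
            errors.append(f"transaction without public_id: {tx}")
    return errors

def generate_digest(plan_and_execute, max_retries: int = 2) -> dict:
    """Run the LLM plan/execute step, feeding validation errors back to it
    until the structured output passes (or retries are exhausted)."""
    feedback = None
    for _ in range(max_retries + 1):
        payload = plan_and_execute(feedback=feedback)
        errors = validate_digest(payload)
        if not errors:
            return payload  # then persisted, and re-validated by the Treasury API
        feedback = "; ".join(errors)
    raise ValueError(f"digest still invalid after retries: {feedback}")
```

The key design point from the discussion is that validation happens twice: once against the LLM's structured output (with a feedback loop), and once more in the Treasury API, which only renders entities whose public IDs still resolve.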
Me: I just wanted to say: yes, this is the start, right? It's a very, very fast V0 of the Digest. But I definitely think we should lean into learning as much as possible, and we'll probably talk more about that at some point, both evals of Palm Pulse, making sure we're correct and all of that, but also learning. Because if we learn from our users and their preferences... I'm imagining this page for Lucia at On, who cares more about the high-level perspective. She doesn't want to look at the account balance activity; that's not at all relevant to her. So what would her digest look like? Whereas this digest, as it is now, might be more relevant to someone like Amanda, who is much more nitty-gritty and into the details. So if we can find a way to have a system that learns and then suggests, or nudges people, like, "hey, this seems relevant to you", I think that would be a really, really cool vision to work towards, because I think it's hard to ask people what they want to see; they sometimes don't know themselves. But if we can see what they actually use our platform for, I think that can be a smart way to introduce ways to customize this, or maybe help us come up with, "hey, this block should exist, users are asking about this a lot in Palm Chat, can that be a block?" Maybe that would be valuable, so they don't have to repeatedly ask the same things. How can we lean really heavily into that learning loop? That's the part I'm most excited about, to be honest.
Them: Definitely agree.
Me: Super cool.
Them: Yeah, thank you. It's a super cool implementation; thanks to both Rodel and Art for summarizing how it's done. I was wondering: I understand we limit the AI's hallucinations much more by showing it which databases to use, and only the client's data. But all the chats that I've seen still use the disclaimer "AI can make mistakes". Do we feel confident enough that we don't show it, or should we still show it? I don't know the right answer; I just wonder what the perspective is. Maybe we should show it. I think, Art? Yeah, I was going to say the same. The only reason I think we don't show it at the moment is because it's used internally, and I think the users internally are quite educated on the capabilities of LLMs. I hope. I hope Jan is not making big decisions based on that yet. No, they just decided to pay off a 100 million loan because of an AI tip. So no pressure. But I agree on that. Well, even if everything is not 100% correct, the context actually is. So maybe the numbers are not correct, but how to fund different accounts, how certain forecasts look, those are correct, and that's all they need: a signal to investigate further. This already accelerates a lot of their workflow. Yeah, for sure, I'm not saying no to that. It's just that I think they should understand that we have less control over these numbers than over what we show in tables, right? I think we should put a Beta label or something like that on it to make it more clear. But we communicate it every time with the customers: look, this is a prototype, this is Beta, it's an LLM, it can hallucinate. So they know that, at least the prospects and the customers we're talking to right now.
Me: I agree. Rodel.
Them: I do have a few slides that touch upon this. I'm not sure if we're showing them.
Me: I don't. So I don't want to hijack, but I'm going to share a link, because I'm looking for a lot of input from everyone here. There's a doc that I put together on the product hub.
Them: Rodel.
Me: It's a bit of a read, but I genuinely need feedback on this doc, especially on the tech aspects of it. The framework I'm looking to establish here for our AI-native, natural-language-based products is, first, correctness: we always strive for correctness; is the answer actually right? The second is around governance: is this answer allowed? Allowed being both policy, like, hey, we have policies at this company and the LLM can't give a recommendation that violates policy, and legal, like, this entity is not legally allowed to move money that way, or it's simply not possible; there are other restrictions we need to take into account. And the third is about: is this the right answer for this company? The very, very nitty-gritty, company-specific context. I've given it my best effort outlining what I mean by that. But what we really need to do, going back to learning, is build tooling around learning and evals, and understand how to appropriately evaluate model responses. Like what Giannis did with On, when he extracted that skill from Rodrigo about his process for, what was it, liquidity analysis of some sort, I think.
Them: Basically which accounts are going to go into Overdraft and where should we fund those accounts from?
Me: Yeah, perfect. And that's a super amazing blueprint for us to use and build towards: evaluate each step of that process Rodrigo has. Are we delivering the correct answer here? Is it an allowed answer? And are we taking the On specifics into account? I'm just going to keep pushing this, but I really encourage everyone to go in and read it, because I want us all to align on whether this is the direction that makes sense or not, in terms of the tech. Cool.
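The three gates described above (is the answer correct, is it allowed, is it right for this specific company) could be made concrete as a simple evaluation pipeline. Everything in this sketch is hypothetical: the `GateResult` type, the field names, and the example checks are invented to illustrate the framework, not taken from the actual doc.

```python
from dataclasses import dataclass

@dataclass
class GateResult:
    passed: bool
    reason: str = ""

def check_correct(answer: dict) -> GateResult:
    # Gate 1 (correctness): is the answer grounded in actual data? In practice
    # this is where evals against known-good outcomes would live.
    if not answer.get("sources"):
        return GateResult(False, "answer cites no underlying data")
    return GateResult(True)

def check_allowed(answer: dict, forbidden_actions: set) -> GateResult:
    # Gate 2 (governance): company policy and legal restrictions, e.g. an
    # entity that is not legally allowed to move money a certain way.
    violations = set(answer.get("actions", [])) & forbidden_actions
    if violations:
        return GateResult(False, f"violates policy: {sorted(violations)}")
    return GateResult(True)

def check_company_fit(answer: dict, company: dict) -> GateResult:
    # Gate 3 (company-specific): is this the right answer for *this* company?
    # E.g. a funding suggestion should only use approved funding accounts.
    used = set(answer.get("funding_accounts", []))
    extra = used - company.get("approved_funding_accounts", set())
    if extra:
        return GateResult(False, f"non-approved accounts: {sorted(extra)}")
    return GateResult(True)

def evaluate(answer: dict, forbidden_actions: set, company: dict) -> GateResult:
    """Run the gates in order; the first failure explains why the answer is rejected."""
    for result in (check_correct(answer),
                   check_allowed(answer, forbidden_actions),
                   check_company_fit(answer, company)):
        if not result.passed:
            return result
    return GateResult(True)
```

Running the gates in order mirrors the doc's framing: an answer that isn't even correct never reaches the governance or company-fit questions.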
Them: You had a question about whether you want to dive into further slides? I'd leave it up to you; I think you wanted to demo something as well. I could demo, but, yeah, I have a few slides that touch on evaluation and the future of Pulse. I think a lot of that has been discussed indirectly, though. Feel free to go through it. I have something small to show as well, but we can end with that. Yeah, let me just delete that slide. All right. And this leaves us with the future of Pulse, and the future of Pulse is of course very bright. For the presentation this week, we are already working on integrating Pulse and forecasting. What we are doing right now, and this is still in development, so it's not available outside Dev, is actually making the forecast referenceable within the app. So if a user sees something weird in the forecast, they can now use this Cursor- or Claude Code-like tool where they just add the forecast as context and immediately start with the proper context, rather than having to explain to the LLM what's happening. I think this is quite nice because it saves the user time and it directly links the source of the data into the chat. Yes, there's a question, I think. It's not a question, it's more a thought: if you want users to use chat and the app at the same time, you'd want to be able to use chat on every page, right? Yes, that's the idea. For now it's a very simplistic approach, but that is what the references enable: in the future, if we want a little chat toggle everywhere, that toggle would actually be able to reference the content on the page. I have to say that right now we are sort of approximately recreating every dashboard in the chat. We need to figure out a better way to directly get what the gold layer gets and how the front end renders it.
But right now we are actually just agentically recreating every chart. Next in line is forecasting scenarios. Scenarios are already possible, but you have to do it like a caveman, with buttons and whatnot. The idea is to do scenarios with natural language instead, and this is a very cool feature Simon was working on last week. Imagine you are doing an analysis, you find some insights, and then you realize: actually, this changes the forecast; we've got some new insights. So while chatting in the Pulse chat, you can just tell Pulse to create a new scenario with the insights you've just discovered, or the insights you want to explore. With row-level access being figured out, we can get very creative. We now have all the tools, all the power, to build whatever we think is best with Pulse. The hardest part, which was data security and access levels, has now been figured out. So it's really up to us to build new features and come up with more Pulse things. However, as you already mentioned, we do need to check the pulse: we need to check the accuracy, evaluate responses, check for hallucinations at least post-chat, check the relevancy of the tools, and check whether the proper context is being provided to the LLM so that it's actually able to generate the appropriate response. And ultimately, once we figure all of this out, we have to focus on cost and efficiency. Whilst Palm MCP is essentially free, Palm Chat is using our tokens. So this costs money, and if users start using it more and more, our costs will ramp up, so we do have to think about cost and efficiency and how to make this cheaper, and constantly evaluate different LLMs and whatnot. But, yeah, this is it. I think Rodel has some very cool stuff to show. Yes.
If there are no further questions for Art, I will show the last piece. I was actually having some fun over the past few days: I was outside in the sun, but I had some agents doing work, and it actually worked remarkably well. It needed some direction, but I'm fine with giving that now and then. Looking at the future of Pulse, I think this is a glimpse into it, and it's not actually very far away. I have to say it's a bit like looking at what Otlar has released as well, but I think we can do a lot more cool stuff with this, with agents doing work inside our platform. We've seen the BigQuery example: retrieving some data and then having some output, right? The first step I tried was: can you create a way to schedule these agents and then get the output that comes out of all of this? Let's quickly take a look. This is all on my local, by the way. So let's give it a very descriptive name, say "Past week". I'm just going to put this in for the example, and then let's test this workflow. What happens behind the scenes now is that it's going to execute a workflow to do what the user has described. You can make it much more complex, of course; you don't need to say "my cash balances", you can do more complex workflows, and we probably need a way to structure that better than just one input box. But then again, I think this is just a first example of what it can do. This will take a little bit of time. Art mentioned that we need to grab the BigQuery schemas if that hasn't happened in the last 10 minutes, and this is my local, so it definitely hasn't happened in the last 10 minutes. It will query all the schemas from BigQuery and get the latest schemas with all the column descriptions. I saw a question about that, I think from Nielsen, as well.
And based on all those column descriptions, it will know how to query the data. Let's hope this one works. By the way, the summarization messages that we show here we could also show in the chat. We don't do that yet, but I think that might be a nice addition, where we give a bit more information about what it's doing, just not with column names, table names, and those kinds of things. All right, it did its job. So, cash balances for the past week. I actually forgot to click the Excel or PDF button here, but you can also select a PDF or an Excel export. I'm not going to make you wait for this one, by the way, but let's save it now. So we saved it; let's go into it. By the way, this is just replicating the report page, but I think it's nice enough as a first iteration; credits to Jason and Tomoya for those designs. These will actually run on the schedule that you have provided, and you can also trigger them as a one-off event. I won't make you wait, because then we'd have to wait a few more minutes for it to complete, but in principle it will show the text output here. We can actually look at an example, this one: you have the output, and then you have a "download PDF file" button. I'm not sharing my full screen, so you cannot see it, but it's actually a nicely formatted PDF file with the output that you see here in this very tiny, small box. That's it. It's a glimpse into what we can do further. But imagine hooking this up to a user's files or cloud storage, hooking it up to a workspace like Google Drive, where it can automatically find files that are relevant, or further integrations into the ecosystem. Even with an MCP, like the Palm MCP, you could automatically schedule these workflows.
So there are, I think, a lot of possibilities that even live outside of the Palm platform, while still being able to capture exactly what's being done. So "limitless possibilities" is the answer.
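The scheduled data-report agent demoed above, including the "re-fetch the BigQuery schemas if that hasn't happened in the last 10 minutes" behavior, could look roughly like the sketch below. All names, the constructor shape, and the exact caching rule are assumptions made for illustration; only the overall flow (refresh schema cache, run the agentic workflow, return text output plus an optional PDF/Excel export) comes from the demo.

```python
import time

SCHEMA_TTL_SECONDS = 10 * 60  # re-fetch BigQuery schemas if older than ~10 minutes

class DataReportAgent:
    """Illustrative sketch of a schedulable data-report agent."""

    def __init__(self, name, prompt, fetch_schemas, run_workflow, export_format=None):
        self.name = name                      # e.g. "Cash balances past week"
        self.prompt = prompt                  # the user's natural-language description
        self._fetch_schemas = fetch_schemas   # pulls table + column descriptions from BigQuery
        self._run_workflow = run_workflow     # the agentic query/summarize step
        self.export_format = export_format    # e.g. "pdf" or "xlsx", or None
        self._schemas = None
        self._fetched_at = float("-inf")

    def run(self, now=None):
        """One scheduled (or one-off) execution of the report workflow."""
        now = time.time() if now is None else now
        if self._schemas is None or now - self._fetched_at >= SCHEMA_TTL_SECONDS:
            self._schemas = self._fetch_schemas()   # only when the cache is stale
            self._fetched_at = now
        text = self._run_workflow(self.prompt, self._schemas)
        return {"name": self.name, "output": text, "export": self.export_format}
```

The same `run` method serves both the cron-style schedule and the one-off trigger shown in the demo; the schema cache just makes repeated runs cheap.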
Me: Love that. Love the limitless possibilities. Now, I know it's a bit of a tangent, but for files: will we at some point be able to explore more in-depth file ingestion? So more directly LLM-assisted ingestion of forecast files, or that budget file, for example. Would it be possible to give us a bit more flexibility than the current forecast ingestion feature that we have? I'm asking the team.
Them: File uploads, you mean? Right, more flexible file uploads.
Me: Yes. And not having to... So for every new type of data that the user wants to ingest, maybe all we need to do, in a perfect utopian world, is define a database schema, and okay, we have a new type of data now and we have a table for that. But have a quick way to set ourselves up for allowing users to drag and drop these files onto whatever surface in Palm, and then build a really neat user experience around having that file properly stored, so it becomes reusable data that the user can reference later on as well.
Them: I can give a bit of my perspective. I think there are two things here. On the one hand, there are the file uploads that we have today, but smarter and more general. What that does is map uploaded files into a specific data schema on our side. You can do this for bank statements, you can do it for forecasts, you can do it for investments. I think that's all possible, and something we should definitely have. And then there's another angle to this, which is file ingestion for Pulse, as more general context. You might do it for both, so that you have both. If you ingest in the first way I mentioned, like bank statements, forecasts, or investments, you already have it in the system, already mapped, and then Pulse can use it. And then there are ways, I think, that we need to ingest more unstructured information: not a PDF of a bank statement, but more like a PDF of a policy, right? Like an investment policy, or certain other information. That becomes quite interesting, because we need to save that information and then make sure the LLMs can reference it, which means retrieval, which means we need to build quite some new cool stuff around that.
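The two ingestion paths Rodel distinguishes (structured files mapped onto an internal schema, versus unstructured documents stored and indexed for retrieval) might be dispatched like this. The target schemas, file-type names, and `index_for_retrieval` hook are all invented for the sketch; they are not the real data model.

```python
# Hypothetical internal schemas per structured file type (illustrative only).
TARGET_SCHEMAS = {
    "bank_statement": ["account_id", "booking_date", "amount", "currency"],
    "forecast":       ["account_id", "value_date", "expected_amount"],
    "investment":     ["instrument", "maturity_date", "principal"],
}

def map_to_schema(rows, file_type):
    """Map uploaded rows onto our internal schema for that file type,
    keeping only the columns we know about."""
    columns = TARGET_SCHEMAS[file_type]
    return [{col: row.get(col) for col in columns} for row in rows]

def ingest(rows, file_type, index_for_retrieval):
    """Route an upload: structured types become regular, queryable records;
    everything else is stored and indexed so the LLM can retrieve it later."""
    if file_type in TARGET_SCHEMAS:
        return {"kind": "structured", "records": map_to_schema(rows, file_type)}
    # e.g. an investment policy PDF: usable only via retrieval afterwards.
    return {"kind": "unstructured", "doc_ids": index_for_retrieval(rows)}
```

The point of the split is the one made in the discussion: structured uploads land as already-mapped data that Pulse can query directly, while unstructured documents need a separate retrieval layer before the LLM can reference them.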
Me: Super quick, the reason I'm asking is that I'm seeing we're going to need, near term, a lot of capabilities to ingest, or let users input, be it investment data or budgets from FP&A. So if there's an option to do it more nicely using this, instead of us building more static UIs for it... It's just something I wanted to bring up, and we can have a think; I think it would be cool to look at. Cool.
Them: Let's see, it's done now; it took a minute or so. I'd need to share my full screen if you want to see it, but anyway.
Me: Super cool. Love it.
Them: Now, the difficult question: when can we have this available to customers? Learning from a lot of the stuff we did before, like the chat I just showed, it wasn't actually too difficult for the AI to come up with this; the front end was actually the most difficult part. But given that it's reusing a lot of things we already have, it wasn't too hard. We probably want to make it a lot nicer, but I don't think it's too far-fetched to make this available somewhere in the coming weeks. Nice, including the AI Digest and all of this, right? Yeah, we do that this week: the chat and Digest are aimed to go live on Wednesday, together with the marketing release. And then we can see what we do around the whole agent scheduling thing. One thing I haven't discussed: should we maybe save the name "agents" for something else, for when we actually start agentic work within Pulse? Yeah, that's a question. It is a bit of an agent, right? It's a data report agent, kind of, but a very first, simple one. No, I was just thinking of the near future, when Pulse makes bank transactions, you know? But that's a different type, I think. Not to go off on a tangent here, but in the database there's also a type on the agent. This is just a data report agent, but you could also have an execution agent, right? You schedule an execution agent that looks at what happened in the last day, and then you can have it come up with suggestions for what to do. But maybe we should name this differently, I don't know. I think that's all to be decided; I don't have a very strong opinion on it.
Me: I think we can experiment with different things, and messaging can change; let's just do stuff, and we'll learn along the way what to call it. I love that we're just getting it out first. And, I mean, it's fine too to later say: okay, we'll call it this instead, I prefer that, or the other way around.
Them: Any further questions? Thank you all for joining, and thank you, Art, for preparing a lot of the stuff here. Art made a document about the whole architecture, which I could reference in the one hour of preparation I did before this, so that helped me as well. Claude Code made it. Thank you, Claude. All right, thanks, everyone.